NAVAL C3 DISTRIBUTED TACTICAL DECISIONMAKING


1.  PROJECT  OBJECTIVES 

The objective of the research is to address analytical and computational issues that arise in the
modeling,  analysis  and  design  of  distributed  tactical  decisionmaking.  The  research  plan  has  been 
organized  into  two  highly  interrelated  research  areas: 


(a) Distributed Tactical Decision Processes;
(b) Distributed Organization Design.


The  focus  of  the  first  area  is  the  development  of  methodologies,  models,  theories  and  algorithms 
directed  toward  the  derivation  of  superior  tactical  decision,  coordination,  and  communication 
strategies  of  distributed  agents  in  fixed  organizational  structures.  The  framework  for  this 
research  is  normative. 


The  focus  of  the  second  area  is  the  development  of  a  quantitative  methodology  for  the  evaluation 
and  comparison  of  alternative  organizational  structures  or  architectures.  The  organizations 
considered  consist  of  human  decisionmakers  with  bounded  rationality  who  are  supported  by 
systems.  The  organizations  function  in  a  hostile  environment  where  the  tempo  of  operations  is 
fast; consequently, the organizations must be able to respond to events in a timely manner. The
framework for this research is descriptive.

2.  STATEMENT  OF  WORK 


The  research  program  has  been  organized  into  seven  technical  tasks  -  four  that  address  primarily 
the  theme  of  distributed  tactical  decision  processes  and  three  that  address  the  design  of  distributed 
organizations.  An  eighth  task  addresses  the  integration  of  the  results.  They  are: 

2.1 Real Time Situation Assessment: Static hypothesis testing, the effect of human constraints
and the impact of asynchronous processing on situation assessment tasks will be
explored.

2.2 Real Time Resource Allocation: Specific research topics include the use of algebraic
structures  for  distributed  decision  problems,  aggregate  solution  techniques  and 
coordination. 

2.3  Impact  of  Informational  Discrepancy:  The  effect  on  distributed  decisionmaking  of 
different  tactical  information  being  available  to  different  decisionmakers  will  be  explored. 
The  development  of  an  agent  model,  the  modeling  of  disagreement,  and  the  formulation 
of coordination strategies to minimize disagreement are specific research issues within this
task.

2.4  Constrained  Distributed  Problem  Solving:  The  agent  model  will  be  extended  to  reflect 
human  decisionmaking  limitations  such  as  specialization,  limited  decision  authority,  and 
limited  local  computational  resources.  Goal  decomposition  models  will  be  introduced  to 
derive  local  agent  optimization  criteria.  This  research  will  be  focused  on  the  formulation 
of  optimization  problems  and  their  solution. 

2.5  Evaluation  of  Alternative  Organizational  Architectures:  This  task  will  address  analytical 
and  computational  issues  that  arise  in  the  construction  of  the  generalized 
performance-workload  locus.  This  locus  is  used  to  describe  the  performance 
characteristics  of  a  decisionmaking  organization  and  the  workload  of  individual 
decisionmakers. 

2.6  Asynchronous  Protocols:  The  use  of  asynchronous  protocols  in  improving  the  timeliness 
of  the  organization's  response  is  the  main  objective  of  this  task.  The  tradeoff  between 
timeliness  and  other  performance  measures  will  be  investigated. 

2.7  Information  Support  Structures:  In  this  task,  the  effect  of  the  C3  system  on  organizational 
performance  and  on  the  decisionmaker’s  workload  will  be  studied. 

2.8 Integration of Results: A final, eighth task is included in which the various analytical and
computational results will be interpreted in the context of organizational bounded
rationality.

3. STATUS REPORT


In  the  context  of  the  first  seven  tasks  outlined  in  Section  2,  a  number  of  specific  research 
problems  have  been  formulated  and  are  being  addressed  by  graduate  research  assistants  under  the 
supervision  of  project  faculty  and  staff.  Research  problems  which  were  completed  prior  to  or 
were  not  active  during  this  last  year  have  not  been  included  in  the  report. 

3.1 DISTRIBUTED TEAM HYPOTHESIS TESTING WITH SELECTIVE
COMMUNICATIONS

Background: In Command-and-Control-and-Communication (C3) systems, multiple
hypothesis-testing  problems  abound  in  the  surveillance  area.  Targets  must  be  detected  and  their 
attributes  must  be  established;  this  involves  target  discrimination  and  identification.  Some  target 
attributes,  such  as  location,  are  best  observed  by  sensors  such  as  radar.  More  uncertain  target 
locations  are  obtained  by  passive  sensors,  such  as  sonar  or  IR  sensors.  However,  target  identity 
information  requires  other  types  of  sensors  (such  as  ESM  receivers,  IR  signature  analysis, 
human intelligence, etc.). As a consequence, in order to accurately locate and identify a specific
target  out  of  a  possibly  large  potential  population  (including  false  targets)  one  must  design  a 
detection  and  discrimination  system  which  involves  the  fusing  of  information  from  several 
different sensors generating possibly specialized information about the target. These sensors may
be  collocated  on  a  platform  (say  a  ship  in  a  Naval  battle  group)  and  be  physically  dispersed  as 
well  (ESM  receivers  exist  in  every  ship,  aircraft,  and  submarine).  The  communication  of 
information  among  this  diverse  sensor  family  may  be  difficult  (because  of  EMCON  restrictions) 
and  is  vulnerable  to  enemy  countermeasure  actions  (physical  destruction  and  jamming).  It  is  this 
class  of  problems  that  motivates  our  research  agenda. 

Research  Goals:  We  are  conducting  research  on  distributed  multiple  hypothesis  testing  using 
several decision-makers, and teams of decision-makers, with distinct private information and
limited  communications.  The  goal  of  this  research  is  to  unify  our  previous  research  in  situation 
assessment,  distributed  hypothesis  testing,  and  impact  of  informational  discrepancy;  and  to 
extend  the  methodology,  mathematical  theory  and  computational  algorithms  so  that  we  can 
synthesize and study more complex organizational structures. The solution of this class of basic
research problems will have impact in structuring the distributed architectures necessary for the
detection,  discrimination,  identification  and  classification  of  attributes  of  several  targets  (or 
events)  by  a  collection  of  distinct  sensors  (or  dispersed  human  observers). 

The  objective  of  the  distributed  organization  will  be  the  resolution  of  several  possible  hypotheses 
based on many uncertain measurements. Each hypothesis will be characterized by several
attributes.  Each  attribute  will  have  a  different  degree  of  observability  to  different  decision  makers 
or  teams  of  decision  makers;  in  this  manner,  we  shall  model  different  specialization  expertise 
associated with the detection and resolution of different phenomena. Since each hypothesis will
have  several  attributes,  it  follows  that  in  order  to  reliably  confirm  or  reject  a  particular  hypothesis, 
two  or  more  decision-makers  (or  two  or  more  teams  of  decision-makers)  will  have  to  pool  and 
fuse  their  knowledge. 

Extensive and unnecessary communication among the decision-makers will be discouraged by
explicitly assigning costs to certain types of communication. In this manner, we shall seek to
understand and isolate which communications are truly vital to the organizational performance;
the very problem formulation will discourage communications whose impact upon performance
is minimal. Quantitative tradeoffs will be sought.

Another  feature  which  will  be  incorporated  relates  to  the  vulnerability  of  the  distributed  decision 
process  to  enemy  countermeasures.  Thus,  in  our  distributed  decision  models  we  shall  assume 
that  there  is  a  finite  probability  that  the  actions  (decisions  and/or  conclusions)  of  any  one 
particular  decision-maker  will  be  distorted  or  destroyed  due  to  enemy  action.  As  a  consequence, 
the  organization  of  the  decision  teams,  the  protocols,  and  the  decision  rules  must  explicitly  take 
into  account  the  vulnerability  issue.  As  a  minimum,  a  certain  level  of  decision-making 
redundancy  must  exist  in  the  distributed  organization;  the  coordination  strategies  and  the 
protocols  that  isolate  "damaged"  decision-makers  will  be  developed.  We  shall  seek  to  determine, 
in  a  quantitative  setting,  the  minimum  required  level  of  decision-maker  redundancy  as  a  function 
of  the  degree  of  vulnerability  to  enemy  countermeasures  (such  as  jamming). 

We  stress  that  we  shall  strive  to  design  distributed  organizational  architectures  in  which  teams  of 
teams of decision-makers interact. For example, a team may consist of a primary decision-maker
together with a consulting decision-maker -- the paradigm used by Papastavrou and Athans.


The  methodology  that  we  plan  to  employ  will  be  mathematical  in  nature.  To  the  extent  possible 
we  shall  formulate  the  problems  as  mathematical  optimization  problems.  Thus,  we  seek 
normative  solution  concepts.  To  the  extent  that  human  bounded  rationality  constraints  are 
available,  these  will  be  incorporated  in  the  mathematical  problem  formulation.  In  this  case,  the 
nature of the results will correspond to what is commonly referred to as normative/descriptive
solutions. Therefore, we visualize a dual benefit of our basic research results. From a purely
mathematical point of view, the research will yield nontrivial advances to the distributed
hypothesis-testing problem, an extraordinarily difficult problem from a mathematical point of
view. From a psychological perspective, we hope that the normative results will suggest
counterintuitive behavioral patterns of -- even perfectly rational -- decision-makers operating in a
distributed  tactical  decision-making  environment;  these  will  set  the  stage  for  designing  empirical 
studies  and  experiments  and  point  to  key  variables  that  should  be  observed,  recorded  and 
analyzed  by  cognitive  scientists.  From  a  military  C3  viewpoint,  the  results  will  be  useful  in 
structuring  distributed  architectures  for  the  surveillance  function. 

Progress  to  Date:  Research  was  initiated  in  September  1987.  At  present  we  are  in  the  modeling 
and problem formulation phase. The challenge is to pose the problem in such a way that its
generic richness is preserved, while retaining a chance for mathematical solutions which will provide
insight.

We  have  developed  a  simple  model  for  capturing  the  effects  of  countermeasures.  Suppose  that 
we  have  a  decision-maker  that  makes  a  binary  decision,  i.e.  YES,  I  believe  that  I  see  a  target  vs 
NO,  I  do  not  believe  that  a  target  is  there.  We  can  have  a  small  but  finite  probability  that  when  the 
decision-maker  meant  to  say  YES  the  other  team  members  hear  NO,  and  vice  versa.  The  degree 
of  the  countermeasures  intensity  can  be  quantified  by  the  numerical  value  of  the  assigned 
probability.  This  way  of  modeling  the  impact  of  enemy  countermeasures  does  not  complicate  the 
mathematics  very  much  in  the  distributed  hypothesis-testing  algorithms. 
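
The countermeasure model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the parameter `flip_prob` (the assigned probability that a YES is heard as NO, or vice versa) are introduced here and do not appear in the report.

```python
import random


def transmit(decision: bool, flip_prob: float, rng: random.Random) -> bool:
    """Return what the teammates hear: the decision, inverted with probability flip_prob."""
    return (not decision) if rng.random() < flip_prob else decision


def received_yes_probability(p_yes: float, flip_prob: float) -> float:
    """P(teammates hear YES) when the sender intends YES with probability p_yes."""
    return p_yes * (1.0 - flip_prob) + (1.0 - p_yes) * flip_prob


def empirical_flip_rate(flip_prob: float, trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of how often an intended YES is heard as NO."""
    rng = random.Random(seed)
    flips = sum(1 for _ in range(trials) if not transmit(True, flip_prob, rng))
    return flips / trials
```

Because the distortion enters only through `received_yes_probability`, a receiving decision-maker can fold it into the likelihoods it already uses, which is consistent with the remark that the model adds little mathematical complication.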

Progress during the past quarter: In the past quarter we focused our attention on the problem of
ternary hypothesis testing by a team of two cooperating decision-makers; communication
between the two decision-makers is costly and consists of a finite alphabet. The problem is to
distinguish  among  three  different  hypotheses.  Each  decision-maker  obtains  an  uncertain 
measurement  of  the  true  hypothesis.  The  so-called  primary  decision-maker  has  the  option  of 
making  the  final  team  decision  or  consulting,  at  a  cost,  the  consulting  decision-maker.  The 
consulting decision-maker is constrained to provide information using a ternary alphabet. The
team  objective  is  to  minimize  the  probability  of  error  together  with  the  communications  cost  (if 
any). 
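
The team objective just described can be written compactly. The following is only a notational sketch: the symbols $u$ (final team decision), $H$ (true hypothesis) and $c$ (cost of one consultation) are introduced here for illustration and are not taken from the report.

```latex
J \;=\; \Pr(u \neq H) \;+\; c \,\Pr(\text{the primary decision-maker consults}),
\qquad c \ge 0 .
```

Setting $c = 0$ would make consultation free, while a large $c$ pushes the primary decision-maker toward deciding alone.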

This  seemingly  simple  distributed  decision  problem  turns  out  to  have  an  extraordinarily  complex 
structure.  We  have  been  able  to  characterize  the  nature  of  the  optimal  solution;  however,  we  have 
not  obtained  as  yet  the  formal  mathematical  solution  and  associated  algorithms.  Nonetheless,  it  is 
possible  to  obtain  a  significant  insight  into  the  complexity  of  multiple  hypothesis-testing 
problems.  Also,  we  have  made  progress  in  pinpointing  what  we  mean  by  calling  a 
decision-maker  an  "expert"  in  some  hypotheses  and  a  "novice"  in  others.  These  are  critical  issues 
when  we  examine  more  complex  decision  organizations  with  several  members. 

Many  more  mathematical  models  and  tentative  approaches  remain  to  be  developed.  This  research 
will  most  probably  form  the  core  of  the  Ph.D.  research  of  J.  Papastavrou  under  the  supervision 
of  Professor  Athans. 

Documentation: None as yet. A presentation is in preparation for delivery at the C3 Symposium
in  June  1987. 

3.2  DISTRIBUTED  HYPOTHESIS  TESTING  WITH  MANY  AGENTS 

Background:  The  goal  of  this  research  project  is  to  develop  a  better  understanding  of  the  nature 
of  the  optimal  messages  to  be  transmitted  to  a  central  command  station  (or  fusion  center)  by  a  set 
of agents who receive different information on their environment. In particular, we are interested
in  solutions  of  this  problem  which  are  tractable  from  the  computational  point  of  view.  Progress 
in  this  direction  has  been  made  by  studying  the  case  of  a  large  number  of  agents. 
Normative/prescriptive solutions are sought.

Problem Statement: Let $H_0$ and $H_1$ be two alternative hypotheses on the state of the environment
and let there be $N$ agents (sensors) who possess some stochastic information related to the state
of the environment. In particular, we assume that each agent $i$ observes a random variable $y_i$
with known conditional distribution $P(y_i \mid H_j)$, $j = 0, 1$, given either hypothesis. We assume that
all agents have information of the same quality, that is, the random variables are identically
distributed. Each agent transmits a binary message to a central fusion center, based on his
information $y_i$. The fusion center then takes into account all messages it has received to declare
hypothesis $H_0$ or $H_1$ true. The problem consists of determining the optimal strategies of the
agents as far as their choice of message is concerned. This problem has long been recognized as
a prototype problem in team decision theory: it is simple enough so that analysis may be
feasible, but also rich enough to allow nontrivial insights into optimal team decision making
under uncertainty.

Results: This problem is being studied by Prof. J. Tsitsiklis and a graduate student, Mr. George
Polychronopoulos. Under the assumption that the random variables $y_i$ are conditionally
independent (given either hypothesis), it is known that each agent should choose his message
based on a likelihood ratio test. Nevertheless, we have constructed examples which show that
even though there is a perfect symmetry in the problem, it is optimal to have different agents use
different thresholds in their likelihood ratio tests. This is an unfortunate situation, because it
severely complicates the numerical solution of the problem (that is, the explicit computation of the
threshold of each agent). Still, we have shown that in the limit, as the number of agents becomes
large, it is asymptotically optimal to have each agent use the same threshold. Furthermore, there
is a simple and effective computational procedure for evaluating this single optimal threshold.
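
To make the common-threshold setting concrete, the sketch below evaluates the Bayes error of a fusion center when all agents use the same threshold. It is a minimal illustration under assumptions that are not in the report: observations are taken to be Gaussian with unit variance and mean 0 under $H_0$ or `mean_shift` under $H_1$ (so a likelihood ratio test reduces to a threshold on $y_i$), the priors default to equal, and the threshold is found by a plain grid search. It does not reproduce the report's own computational procedure.

```python
import math


def tail(x: float) -> float:
    """P(Z > x) for a standard normal random variable Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def binom_pmf(n: int, k: int, p: float) -> float:
    """Probability of k ones among n independent Bernoulli(p) messages."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def error_probability(threshold, n_agents, mean_shift=1.0, prior_h0=0.5):
    """Bayes error at the fusion center when every agent sends 1 iff y_i > threshold."""
    p0 = tail(threshold)               # P(message = 1 | H0)
    p1 = tail(threshold - mean_shift)  # P(message = 1 | H1)
    # The count of ones is a sufficient statistic; the MAP rule errs with the
    # smaller of the two joint probabilities for each possible count.
    return sum(
        min(prior_h0 * binom_pmf(n_agents, k, p0),
            (1.0 - prior_h0) * binom_pmf(n_agents, k, p1))
        for k in range(n_agents + 1)
    )


def best_common_threshold(n_agents, mean_shift=1.0, grid_points=401):
    """Grid search over [-2, 3] for the common threshold with the lowest error."""
    grid = [-2.0 + 5.0 * j / (grid_points - 1) for j in range(grid_points)]
    return min(grid, key=lambda t: error_probability(t, n_agents, mean_shift))
```

Restricting all agents to one threshold turns an $N$-dimensional search into a one-dimensional one, which is the computational appeal of the asymptotic result.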


We have also shown that if each agent is to transmit K-valued, as opposed to binary, messages,
then each agent should still use the same decision rule when the number of agents is large.
Unfortunately, however, the computation of this particular decision rule becomes increasingly
difficult as K increases.


We have investigated the case of M-ary ($M > 2$) hypothesis testing and constructed examples
showing that it is better to have different agents use different decision rules, even in the limit as
$N \to \infty$. Nevertheless, we have shown that the optimal set of decision rules is not completely
arbitrary. In particular, it is optimal to partition the set of agents into at most $M(M-1)/2$ groups
and, for each group, each agent should use the same decision rule. The decision rule
corresponding to each group and the proportion of the agents assigned to each group may be
determined by solving a linear programming problem, at least in the case where the set of
possible observations by each agent is finite.

In  more  recent  work,  the  following  have  been  accomplished. 

(a) We studied the Neyman-Pearson (as opposed to Bayesian) version of the problem, in the
case of $M = 2$ hypotheses. The asymptotically optimal solution has been found and involves
the Kullback-Leibler information distance.

(b) We considered a class of symmetric detection problems in which, given any hypothesis $H_i$,
each sensor has probability $\epsilon$ of making an observation indicating that some other hypothesis
$H_j$ is true. A simple numerical procedure has been found which completely solves this
problem. Furthermore, a closed-form formula for the optimal decision rules has been found
for the case where the "noise intensity" $\epsilon$ is very small.
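
For readers unfamiliar with the Kullback-Leibler distance mentioned in item (a), the sketch below computes it for the binary messages induced by a threshold rule. The report does not give the exact role the distance plays in the solution, so this only illustrates the quantity itself. The Gaussian observation model (unit variance, means 0 and `mean_shift`) and all function names are assumptions made here.

```python
import math


def tail(x: float) -> float:
    """P(Z > x) for a standard normal random variable Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def kl_bernoulli(p: float, q: float) -> float:
    """Kullback-Leibler divergence D(Bernoulli(p) || Bernoulli(q)) in nats."""
    total = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0.0:  # the term 0 * log(0 / b) is taken as 0
            total += a * math.log(a / b)
    return total


def message_divergence(threshold: float, mean_shift: float = 1.0) -> float:
    """Divergence between the message distributions under H0 and H1."""
    p0 = tail(threshold)               # P(message = 1 | H0)
    p1 = tail(threshold - mean_shift)  # P(message = 1 | H1)
    return kl_bernoulli(p0, p1)
```

Quantizing an observation to one bit can only lose information, so `message_divergence` never exceeds the divergence between the underlying Gaussian distributions, which is `mean_shift**2 / 2` in this model.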

Future research will address the issue of the validity of asymptotic considerations when the
number of agents $N$ is moderate ($N \approx 5$) and will also investigate alternative (more complex)
decision making architectures.

Documentation 

[1]  J.  N.  Tsitsiklis,  "On  Threshold  Rules  in  Decentralized  Detection,"  Proc.  25th  IEEE 
Conference  on  Decision  and  Control,  Athens,  Greece,  December  1986;  also  LIDS-P-1570, 
Laboratory  for  Information  and  Decision  Systems,  MIT,  Cambridge,  MA,  June  1986. 

[2]  J.  N.  Tsitsiklis,  "Decentralized  Detection  by  a  Large  Number  of  Sensors,"  LIDS-P-1662, 
April  1987;  Submitted  to  Mathematics  of  Control,  Signals  and  Systems. 


3.3 COMMUNICATION REQUIREMENTS OF DIVISIONALIZED ORGANIZATIONS


Background:  In  typical  organizations,  the  overall  performance  cannot  be  evaluated  simply  in 
terms  of  the  performance  of  each  subdivision,  as  there  may  be  nontrivial  coupling  effects 
between  distinct  subdivisions.  These  couplings  have  to  be  taken  explicitly  into  account;  one  way 
of  doing  so  is  to  assign  to  the  decisionmaker  associated  with  the  operation  of  each  division  a  cost 
function  which  reflects  the  coupling  of  his  own  division  with  the  remaining  divisions.  Still,  there 
is  some  freedom  in  such  a  procedure:  For  any  two  divisions  A  and  B  it  may  be  the  responsibility 
of  either  decisionmaker  A  or  decisionmaker  B  to  ensure  that  the  interaction  does  not  deteriorate 
the  performance  of  the  organization.  Of  course,  the  decisionmaker  in  charge  of  those  interactions 
needs  to  be  informed  about  the  actions  of  the  other  decisionmaker.  This  leads  to  the  following 
problem.  Given  a  divisionalized  organization  and  an  associated  organizational  cost  function, 
assign  cost  functions  to  each  division  of  the  organization  so  that  the  following  two  goals  are  met: 
a)  the  costs  due  to  the  interaction  between  different  divisions  are  fully  accounted  for  by  the 
subcosts  of  each  division;  b)  the  communication  interface  requirements  between  different 
divisions  are  small.  In  order  to  assess  the  communication  requirements  of  a  particular 
assignment  of  costs  to  divisions,  we  take  the  view  that  the  decisionmakers  may  be  modeled  as 
boundedly  rational  individuals,  that  their  decisionmaking  process  consists  of  a  sequence  of 
adjustments  of  their  decisions  in  a  direction  of  decreasing  costs,  while  exchanging  their  tentative 
decisions with other decisionmakers who have an interest in those decisions. We then require
that  there  are  enough  communications  so  that  this  iterative  process  converges  to  an 
organizationally  optimal  set  of  decisions. 

Problem Statement: Consider an organization with $N$ divisions and an associated cost function
$J(x_1, \ldots, x_N)$, where $x_i$ is the set of decisions taken at the i-th division. Alternatively, $x_i$ may be
viewed as the mode of operation of the i-th division. The objective is to have the organization
operating at a set of decisions $(x_1, \ldots, x_N)$ which are globally optimal, in the sense that they
minimize the organizational cost $J$. We associate with each division a decisionmaker $DM_i$, who
is in charge of adjusting the decision variables $x_i$. We model the decisionmakers as "boundedly
rational" individuals; mathematically, this is translated to the assumption that each decisionmaker
will slowly and iteratively adjust his decisions in a direction which reduces the organizational
costs. Furthermore, each decisionmaker does so based only on partial knowledge of the
organizational cost, together with messages received from other decisionmakers.

Consider a partition $J(x_1, \ldots, x_N) = \sum_{i=1}^{N} J^i(x_1, \ldots, x_N)$ of the organizational cost. Each subcost $J^i$
reflects the cost incurred to the i-th division and in principle should depend primarily on $x_i$ and
only on a few of the remaining $x_j$'s. We then postulate that the decisionmakers adjust their
decisions by means of the following process (algorithm):

(a) $DM_i$ keeps a vector $x$ with his estimates of the current decisions $x_k$ of the other
decisionmakers; also a vector $\lambda$ with estimates of $\lambda_k^i = \partial J^k / \partial x_i$, for $k \neq i$. (Notice that this
partial derivative may be interpreted as $DM_i$'s perception of how his decisions affect the costs
incurred to the other divisions.)

(b) Once in a while $DM_i$ updates his decision using the rule $x_i := x_i - \gamma \sum_{k=1}^{N} \lambda_k^i$ ($\gamma$ is a small
positive scalar), which is just the usual gradient algorithm.

(c) Once in a while $DM_i$ transmits his current decision to other decisionmakers.

(d) Other decisionmakers reply to $DM_i$ by sending an updated value of the partial derivative
$\partial J^k / \partial x_i$.
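
Steps (a) through (d) can be sketched as follows. This is a simplified illustration, not the report's algorithm as such. It assumes a synchronous version in which every decisionmaker communicates in every round, so all estimates are exact. It also assumes a hypothetical quadratic cost in which division $k$ tracks a target $a_k$ and is coupled only to division $k+1$. The names `targets`, `coupling` and `step` are introduced here.

```python
def subcost_gradients(x, targets, coupling):
    """Return lam with lam[k][i] = dJ^k/dx_i for the assumed subcosts
    J^k = 0.5*(x_k - a_k)^2 + 0.5*coupling*(x_k - x_{k+1})^2
    (the last division has no coupling term)."""
    n = len(x)
    lam = [[0.0] * n for _ in range(n)]
    for k in range(n):
        lam[k][k] = x[k] - targets[k]
        if k + 1 < n:
            diff = x[k] - x[k + 1]
            lam[k][k] += coupling * diff
            lam[k][k + 1] = -coupling * diff
    return lam


def distributed_gradient(targets, coupling=0.5, step=0.1, rounds=2000):
    """Run the gradient iteration with every DM fully informed each round."""
    n = len(targets)
    x = [0.0] * n
    for _ in range(rounds):
        # Steps (c) and (d): decisions are exchanged and partial derivatives returned.
        lam = subcost_gradients(x, targets, coupling)
        # Step (b): DM_i moves against the sum over k of dJ^k/dx_i.
        x = [x[i] - step * sum(lam[k][i] for k in range(n)) for i in range(n)]
    return x
```

In this example `lam[k][i]` is nonzero only for `i` equal to `k` or `k + 1`, so division `k` exchanges messages only with its neighbours, even though the decisions reached minimize the overall cost.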

It is not hard to see that for the above procedure to work it is not necessary that all DM's
communicate with each other. In particular, if the subcost $J^i$ depends only on $x_i$, for each $i$, there would
be no need for any communication whatsoever. The required communications are in fact
determined by the sparsity structure of the Hessian matrix of the subcost functions $J^i$. Recall
now that all that is given is the original cost function $J$; we therefore have freedom in choosing
the $J^i$'s and we should be able to do this in a way that introduces minimal communication
requirements; that is, we want to minimize the number of pairs of decisionmakers who need to
communicate with each other.

The above problem is a prototype organizational design problem and we expect that it will lead to
reasonable insights into good organizational structures. On the technical side, it may involve
techniques and tools from graph theory. Once the above problem is understood and solved, the
next step is to analyze communication requirements quantitatively. In particular, a distributed
gradient algorithm such as the one introduced above converges only if the communication
(between pairs of DM's who need to communicate) is frequent enough. We will then investigate
the required frequencies of communication as a function of the strength of coupling between
different divisions.
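
The dependence of communication requirements on the choice of subcosts can be illustrated with a small counting routine. As a simplifying assumption made here, each subcost is described only by the set of divisions whose decisions appear in it, a coarser description than the Hessian sparsity structure. A pair of decisionmakers is counted as needing to communicate whenever one's subcost involves the other's decision. The function name and data layout are hypothetical.

```python
from itertools import combinations


def communicating_pairs(depends_on):
    """depends_on[i] is the set of division indices whose decisions appear in J^i.
    Returns the set of pairs (i, k), i < k, that must exchange messages."""
    n = len(depends_on)
    return {
        (i, k)
        for i, k in combinations(range(n), 2)
        if k in depends_on[i] or i in depends_on[k]
    }


# Four divisions coupled in a ring: 0-1, 1-2, 2-3, 3-0.
# Each coupling term is charged to the first division of the pair.
ring_assignment = [{0, 1}, {1, 2}, {2, 3}, {3, 0}]
# Every coupling term is charged to division 0.
central_assignment = [{0, 1, 2, 3}, {1}, {2}, {3}]
```

For this ring example the first assignment requires four communicating pairs and the second only three, which shows how charging all interactions to a single division can lower the pair count.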

Progress to Date: A graduate student, C. Lee, supervised by Prof. J. Tsitsiklis, has undertaken the
task of formulating the problem of finding partitions that minimize the number of pairs of
DM's who need to communicate with each other as the topic of his SM research. It was realized
that with a naive formulation the optimal allocation of responsibilities, imposing minimal
communication requirements, corresponds to the centralization of authority. Thus, in order to
obtain more realistic and meaningful problems we are incorporating a constraint requiring that no
agent should be overloaded. A number of results have been obtained for a class of combinatorial
problems, corresponding to the problem of optimal organizational design, under limited
communications. In particular, certain special cases were solved and other special cases have
been successfully reformulated as linear network flow or assignment problems, for which
efficient algorithms are known. A simulation study is underway to validate the hypothesis that
better task allocation results in better convergence.

Documentation:  The  Master's  thesis  of  Mr.  C.  Lee  will  be  ready  by  August  1987. 


3.4  COMMUNICATION  COMPLEXITY  OF  DISTRIBUTED  CONVEX  OPTIMIZATION 


Background:  The  objective  of  this  research  effort  is  to  quantify  the  minimal  amount  of 
information  that  has  to  be  exchanged  in  an  organization,  subject  to  the  requirement  that  a  certain 
goal  is  accomplished,  such  as  the  minimization  of  an  organizational  cost  function.  The  problem 


becomes  interesting  and  relevant  under  the  assumption  that  no  member  of  the  organization 
"knows"  the  entire  function  being  minimized,  but  rather  each  agent  has  knowledge  of  only  a 
piece of the cost function. A normative/prescriptive solution is sought.

Problem Formulation: Let $f$ and $g$ be convex functions of $n$ variables. Suppose that each one of
two agents (or decisionmakers) knows the function $f$ (respectively $g$), in the sense that he is able
to compute instantly any quantities associated with this function. The two agents are to exchange
a number of binary messages until they are able to determine a point $x$ such that $f(x) + g(x)$
comes within $\epsilon$ of the minimum of $f + g$, where $\epsilon$ is some prespecified accuracy. The objective is
to determine the minimum number of such messages that have to be exchanged, as a function of $\epsilon$,
and to determine communication protocols which use no more messages than the minimum
amount required.

Results: The problem is being studied by Professor John Tsitsiklis and a graduate student,
Zhi-Quan Luo. We have shown that at least $O(n \log 1/\epsilon)$ messages are needed and that a suitable
approximate and distributed implementation of ellipsoid-type algorithms works with $O(n^2 \log^2 1/\epsilon)$
messages. The challenge is to close this gap. This has been accomplished for the case of
one-dimensional problems ($n = 1$), for which it has been shown that $O(\log 1/\epsilon)$ messages are also
sufficient. More recently, we have succeeded in generalizing the technique employed in the
one-dimensional case, and we obtained an algorithm which is optimal, as far as the dependence
on $\epsilon$ is concerned. The question of the dependence of the amount of communications on the
dimension of the problem ($O(n)$ versus $O(n^2)$) seems to be a lot harder and, at present, there are
no available techniques for handling it.

An  interesting  qualitative  feature  of  the  communication-optimal  algorithms  discovered  thus  far  is 
the  following:  It  is  optimal  to  transmit  aggregate  information  (the  most  significant  bits  of  the 
gradient  of  the  function  optimized)  in  the  beginning;  then,  as  the  optimum  is  approached  more 
refined information should be transferred. This very intuitive result seems to correspond to
realistic situations in human decisionmaking. Another problem which is currently being
investigated concerns the case where there are $K > 2$ decisionmakers cooperating for the minimization
of $f_1 + \cdots + f_K$, where each $f_i$ is again a convex function.

This  problem  turns  out  to  be  very  hard,  but  some  progress  has  been  made  on  a  simpler  version  of 
the problem. Namely, we considered the problem of evaluating a simple function (say the sum
of  K  numbers)  by  a  hierarchy  (tree)  of  decisionmakers  and  tight  bounds  have  been  obtained  on 
the  required  amount  of  communication. 
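
The hierarchical summation setting can be pictured with the following sketch, in which each decisionmaker reports the sum of its subtree to its superior. It counts messages only and says nothing about message length, so it does not reproduce the bounds referred to above. The tree encoding (`children` as a dictionary from node to list of subordinates) and the function name are assumptions made here.

```python
def tree_sum(children, values, root=0):
    """Return (total, messages): the sum of all values and the number of
    upward reports needed when each node sends its subtree sum to its parent."""
    messages = 0

    def visit(node):
        nonlocal messages
        subtotal = values[node]
        for child in children.get(node, []):
            subtotal += visit(child)
            messages += 1  # the child reports its subtree sum to this node
        return subtotal

    total = visit(root)
    return total, messages


# Example hierarchy: node 0 supervises 1 and 2; node 1 supervises 3 and 4.
example_children = {0: [1, 2], 1: [3, 4]}
example_values = [1, 2, 3, 4, 5]
```

With one report per non-root node, any tree over $K$ decisionmakers uses $K - 1$ messages in this scheme.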

Documentation: 

[1] J. N. Tsitsiklis and Z.-Q. Luo, "Communication Complexity of Convex Optimization,"
LIDS-P-1617, Laboratory for Information and Decision Systems, MIT, October 1986; Proc.
25th IEEE Conference on Decision and Control, Athens, Greece, December 1986. This
paper has been submitted to the Journal of Complexity; an invited talk was also given at the
2nd Symposium on Complexity of Approximately Solved Problems, Columbia University,
New York, April 1987.

3.5  DISTRIBUTED  ORGANIZATION  DESIGN 

Background: The bounded rationality of human decisionmakers and the complexities of the tasks
they  must  perform  mandate  the  formation  of  organizations.  Organizational  architectures 
distribute  the  decisionmaking  workload  among  the  members:  different  architectures  impose 
different  individual  loads  and  result  in  different  organizational  performance.  Two  measures  of 
organizational  performance  are  accuracy  and  timeliness.  The  first  measure  of  performance 
addresses  in  part  the  quality  of  the  organization's  response.  The  second  measure  reflects  the  fact 
that  in  tactical  decisionmaking  when  a  response  is  generated  is  also  significant:  the  ability  of  an 
organization to carry out tasks in a timely manner is a determining factor of effectiveness.

The  scope  of  work  was  divided  into  three  tasks: 

(a)  Evaluation  of  Alternative  Organizational  Architectures; 

(b)  Asynchronous  Protocols;  and 

(c)  Information  Support  Structures. 


During  this  past  year,  the  research  effort  was  organized  around  three  foci.  In  the  first  one,  we 
continued  to  work  on  the  development  of  analytical  and  algorithmic  tools  for  the  analysis  and 
design  of  organizations.  In  the  second,  the  focus  was  integration  of  the  results  obtained  thus  far 
through the development of a workstation for the design and analysis of alternative organizational
architectures.  Finally,  an  experimental  program  was  initiated  with  the  objective  of  collecting  data 
necessary  to  calibrate  the  models  and  evaluate  different  architectures  for  distributed 
decisionmaking. 

3.5.1  Design  and  Evaluation  of  Alternative  Organizational  Architectures. 


In  order  to  design  an  organization  that  meets  some  performance  requirements,  we  need  to  be  able 
to  do  the  following: 

(a)  Articulate  the  requirements  in  qualitative  and  quantitative  terms; 

(b)  Generate  candidate  architectures  that  meet  some  of  the  requirements; 

(c)  Evaluate  the  candidate  organizations  with  respect  to  the  remaining  requirements; 

(d) Modify the designs so as to improve the effectiveness of the organization.

The  generalized  Performance  Workload  locus  has  been  used  as  the  means  for  expressing  both 
the  requirements  that  the  organization  designer  must  meet  and  the  performance  characteristics  of 
any  specific  design.  Consider  an  organization  with  N  decisionmakers.  Then  the  Performance 
Workload  space  is  an  N+2  dimensional  space  in  which  two  of  the  dimensions  correspond  to  the 
measures  of  the  organization's  performance  (say,  accuracy  and  timeliness)  and  the  remaining  N 
dimensions  correspond  to  the  measure  of  the  workload  of  each  individual  decisionmaker.  Two 
loci  can  be  defined.  First,  the  Requirements  locus  is  the  set  of  points  in  this  N+2  dimensional 
space  that  satisfy  the  performance  and  workload  requirements  associated  with  the  task  to  be 
performed  by  the  organization.  The  second,  the  System  locus,  is  the  set  of  points  that  are 
achievable  by  a  particular  design.  The  design  problem  can  then  be  conceptualized  as  the 
reshaping  and  repositioning  of  the  System  locus  in  the  Performance  Workload  space  so  that  the 
requirements are met.


Several  thesis  projects  were  undertaken  during  this  period.  The  individual  problem  statements 
and a description of the progress to date follow:


Generation of Flexible Organizational Structures

Problem  Statement:  Develop  a  method  for  generating  organizational  forms  that  satisfy  some 
structural  and  some  application  specific  constraints. 

Results:  This  problem  has  been  addressed  by  P.  Remy  under  the  supervision  of  Dr.  A.  H. 
Levis.  The  first  step  in  the  procedure  was  the  definition  of  the  Petri  Net  and  the  corresponding 
data  structure  for  the  interacting  decisionmaker.  In  the  past,  information  sharing  was  allowed 
only  between  the  situation  assessment  stage  and  the  information  fusion  process.  This 
assumption  has  been  relaxed  to  allow  four  different  forms  of  information  sharing  -  each  form 
depends  on  the  source  of  the  information  (e.g.,  is  one  DM  informing  the  other  of  his  situation 
assessment  or  of  his  response?)  and  on  the  destination.  For  example,  the  situation  assessment 
of  one  DM  may  be  the  input  to  the  next  one  in  a  serial  or  hierarchical  organization.  After  defining 
the  set  of  possible  interactions,  a  combinatorial  problem  could  be  formulated.  The  dimensionality 
of  this  problem  is  prohibitive,  if  no  constraints  on  the  structure  are  imposed.  There  are 
2^n(2n-1) organizational forms in this formulation, where n is the number of decisionmakers.
These  organizational  forms  are  called  Well  Defined  Nets  (WDNs)  of  dimension  n.  An 
algorithmic  approach  has  been  developed  that  reduces  the  problem  to  a  computationally  tractable 
one. 

A  series  of  propositions,  proved  by  Remy,  set  the  theoretical  basis  of  the  algorithm.  These 
propositions  constitute  significant  extensions  of  Petri  Net  Theory.  The  first  proposition 
establishes  that  if  the  source  and  the  sink  places  of  a  Petri  Net  representing  a  WDN  are  combined 
into  a  single  place  and  if  the  resulting  Petri  Net  is  strongly  connected,  then  it  is  an  event  graph  (a 
special  class  of  Petri  Nets). 

Then, two sets of constraints are introduced to eliminate unrealistic organizational forms. The
first set, the structural constraints, defines what kinds of interactions between decisionmakers must
be ruled out. User-defined constraints allow the designer to introduce specific structural
characteristics that are appropriate (or are mandated) for the particular design problem.


The  first  structural  constraint  imposes  a  minimum  degree  of  connectivity  in  the  organization;  it 
eliminates  structures  that  do  not  represent  a  single  integrated  organization  and  ensures  that  the 
flow  of  information  is  continuous.  The  second  constraint  allows  acyclical  organizations  only. 
This  restriction  is  made  to  avoid  deadlock  and  the  circulation  of  messages.  The  third  constraint 
prohibits one decisionmaker from sending the same data to different stages of another
decisionmaker's  model.  This  is  a  technical,  model-specific  restriction  that  recognizes  the  fact  that 
the  stages  of  decisionmaking  are  a  modeling  artifice  that  should  not  introduce  extraneous 
complexity.  The  last  constraint  restricts  the  situation  assessment  stage  to  receiving  a  single  input; 
multiple  inputs  can  be  received  at  the  information  fusion  stage. 

The  user-defined  constraints  are  arbitrary;  they  reduce  the  degrees  of  freedom  in  the  design 
process.  A  WDN  that  satisfies  the  user-defined  constraints  is  called  an  Admissible 
Organizational  Form.  An  admissible  form  that  also  satisfies  the  structural  constraints  is  a 
Feasible Organization.

The  second  proposition  characterizes  formally  the  admissible  organizational  forms  as  subsets  of 
the  set  of  WDNs.  Furthermore,  it  introduces  the  concept  of  maximal  and  minimal  elements  of  the 
sets.  A  maximal  element  of  the  set  of  Feasible  Organizations  is  called  a  Maximally  Connected 
Organization  (MAXO)  while  a  minimal  one  is  called  a  Minimally  Connected  Organization 
(MINO). 

The  third  proposition  establishes  that  any  feasible  organization  is  bounded  from  above  by  at  least 
one MAXO and from below by at least one MINO.

With  this  characterization  of  the  feasible  structures,  what  remains  is  to  develop  a  procedure  for 
generating  them.  The  procedure  is  based  on  the  concept  of  simple  paths  developed  by  Jin  (or 
equivalently,  on  the  s-invariants  of  Petri  Net  theory).  The  fourth  and  fifth  propositions  lead  to 
the  algorithm  for  generating  feasible  organizations.  They  show  that  one  can  construct  the  set  of 
all  the  possible  unions  of  simple  paths.  Then  one  can  determine  all  the  MAXOs  and  the  MINOs 
of the set. These MAXOs and MINOs bound the solution set. Any feasible organizational form is
a subset of a MAXO and has one or more MINOs as subsets. By adding simple paths to every
MINO  until  a  MAXO  is  reached,  one  can  construct  the  complete  set  of  Feasible  Organizations. 
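
The bounding property described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that an organization is represented as a set of simple-path labels and that the MINOs and MAXOs have already been determined; the names `feasible_between`, `minos`, and `maxos` are hypothetical and do not come from Remy's thesis. The sketch only enumerates the unions of simple paths lying between some MINO and some MAXO; it does not re-check the structural or user-defined constraints.

```python
from itertools import combinations

def feasible_between(minos, maxos):
    """Enumerate path sets S with MINO <= S <= MAXO for some MINO/MAXO pair."""
    found = set()
    for lo in minos:
        for hi in maxos:
            if not lo <= hi:
                continue  # this MINO is not bounded above by this MAXO
            extra = sorted(hi - lo)  # simple paths that may still be added
            for k in range(len(extra) + 1):
                for added in combinations(extra, k):
                    found.add(lo | frozenset(added))
    return found

# Hypothetical example: four simple paths labelled p1..p4.
minos = [frozenset({"p1"}), frozenset({"p2"})]
maxos = [frozenset({"p1", "p2", "p3"}), frozenset({"p2", "p4"})]
orgs = feasible_between(minos, maxos)
```

Storing each candidate as a `frozenset` makes the subset test (`<=`) the same partial ordering that the text uses to bound the solution set.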

This  is  a  powerful  result,  both  theoretically  and  computationally,  that  opens  the  way  for 
generating  classes  of  feasible  organizational  forms  that  meet,  a  priori,  some  structural  and 
performance  requirements.  The  partial  ordering  of  the  solutions  (another  result  established  by 
Remy)  allows  the  use  of  lattice  theory  to  analyze  the  properties  of  various  architectures. 

The  work  of  Remy  considered  organizations  with  fixed  structures:  the  decisions  made  by  the 
organization  members  affected  the  processing  of  the  task  and  resulted  in  different  responses,  but 
did  not  affect  the  structure  of  the  organization.  The  next  step  is  the  consideration  of  flexible 
organizations  in  which  the  realized  structure  at  any  instant  depends  on  the  task  that  is  being 
performed  and  the  decisions  being  made.  Jean  Marc  Monguillet,  under  the  supervision  of  Dr.  A. 
H. Levis, has begun to investigate this question. At this point, the focus of the research is on
understanding the meaning of the term "flexible architecture" and on the identification of
appropriate mathematical tools for the description of such architectures.


Documentation: 


[1] P. A. Remy, "On the Generation of Organizational Architectures Using Petri Nets,"
LIDS-TH-1630,  MS  Thesis,  Laboratory  for  Information  and  Decision  Systems,  MIT, 
Cambridge,  MA,  December  1986. 

[2]  P.A.  Remy,  A.H.  Levis,  V.Y.-Y.  Jin,  "On  the  Design  of  Distributed  Organizational 
Structures," Proc. 10th IFAC World Congress, Munich, FRG, July 1987; also accepted for
publication  in  Automatica. 

[3]  P.A.  Remy  and  A.  H.  Levis,  "On  the  Generation  of  Organizational  Architectures  Using  Petri 
Nets," LIDS-P-1634, Laboratory for Information and Decision Systems, MIT, Cambridge,
MA,  December  1986;  accepted  for  presentation  at  the  8th  European  Symposium  on 
Applications  and  Theory  of  Petri  Nets,  Zaragoza,  Spain,  24-27  June,  1987. 


Design of Organizations


Objective:  Given  a  feasible  organizational  architecture,  develop  a  methodology  for  (a)  identifying 
the  functions  that  must  be  performed  by  the  organization  in  order  that  the  task  be  accomplished, 
(b) selecting the resources (human, hardware, software) that are required to implement these
functions,  and  (c)  integrating  these  resources  -  through  interactions  -  so  that  the  system  operates 
effectively. 

Progress to Date: This research problem is being investigated by Stamos K. Andreadakis under
the supervision of Dr. A. H. Levis. The design methodology has been modified in order to
address the following formulation of the design problem of decisionmaking organizations: Given
a  mission,  design  the  DM  organization  that  is  accurate,  timely,  exhibits  a  task  processing  rate  that 
is  higher  than  the  task  arrival  rate  and  whose  decisionmakers  are  not  overloaded.  The  design 
requirements  explicitly  stated  are: 

The accuracy must exceed a threshold or, equivalently, the expected cost J must be less than the
threshold J0:

J < J0   [1]

The timeliness measure T must be less than a threshold T0:

T < T0   [2]

The task processing rate R must be greater than the task arrival rate R0:

R > R0   [3]

The constraints that must be observed are that no decisionmaker be overloaded, i.e., each
decisionmaker's information processing rate F must be less than the rationality threshold F0:

F < F0   [4]

The  proposed  design  methodology  has  two  stages: 

In  the  first  stage  the  Petri  Net  of  the  data  flow  is  constructed.  Each  function  is  represented  by  a 
transition,  while  the  associated  data  (information)  and  constraints  are  represented  by  places.  This 
Petri  Net  depicts  the  flow  of  information  from  function  to  function,  as  well  as  the  parallel 
(concurrent) and serial operations.


Since  a  multitude  of  data  flow  architectures  can  be  constructed  for  a  particular  mission,  it  is 
necessary  to  classify  them  in  order  to  keep  the  design  problem  tractable.  Current  research 
focuses  on  the  classification  of  data  flow  architectures  and  on  the  selection  of  a  representative 
structure  for  each  data  flow  class. 

The  objective  of  the  first  stage  is  to  compute  an  estimate  of  the  processing  rate  range  and  to  select 
the number of processing channels in order to satisfy the processing rate requirement. In order to
satisfy  the  workload  constraints,  the  total  activity  of  each  function  in  the  Petri  Net  is  computed, 
using  Information  Theory.  Then  a  representative  (average)  value  for  the  information  processing 
capacity of the human is selected and the expected execution time of each function is computed by
dividing  the  total  activity  by  the  processing  capacity.  In  all  subsequent  calculations  of  the 
information  processing  rate,  response  time,  and  timeliness  measures,  these  processing  time 
estimates  are  used.  Thus,  the  workload  constraints  will  always  be  satisfied. 

The  maximum  and  minimum  processing  rates  of  the  organization  can  be  computed  as  follows: 

The  processing  rate  of  each  transition  is  computed  by  dividing  the  information  processing 
capacity  by  the  total  activity  of  the  function  that  is  represented  by  the  transition.  Assuming  that 
each  transition  is  assigned  to  a  different  decisionmaker,  the  maximum  processing  rate  of  the 
Decision  Making  Organization  (DMO)  being  designed  is  equal  to  the  minimum  of  the  rates  of  the 
individual  transitions. 

An  estimate  of  the  minimum  processing  rate  is  obtained  as  follows: 

The  expected  processing  time  corresponding  to  each  information  flow  path  is  computed  by 
summing  the  expected  processing  times  of  its  transitions.  The  rate  estimate  is  the  inverse  of  the 
maximum  expected  processing  time.  The  actual  minimum  rate  can  be  even  lower  due  to 
communications  delays. 

If the processing rate is smaller than the input rate, multiple processing channels, which are
copies of the basic information flow net, must be introduced so that the arriving tasks can be
assigned to alternate processing channels of the DMO. The number of the alternate channels is
computed by dividing the input rate by the processing rate and rounding up to an integer value.
Consequently, the processing rate requirements will be satisfied.
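
The first-stage rate calculations can be sketched as follows. This is only an illustration under stated assumptions: the function names, the dictionary layout, and the numerical values are hypothetical, activity is taken in bits per task and capacity in bits per second, and communication delays are ignored.

```python
import math

def transition_rates(activity, capacity):
    """Rate of each transition = processing capacity / total activity."""
    return {t: capacity / g for t, g in activity.items()}

def max_rate(activity, capacity):
    """Upper estimate: one decisionmaker per transition, the slowest dominates."""
    return min(transition_rates(activity, capacity).values())

def min_rate_estimate(activity, capacity, paths):
    """Lower estimate: inverse of the longest expected path processing time."""
    times = {t: g / capacity for t, g in activity.items()}
    longest = max(sum(times[t] for t in path) for path in paths)
    return 1.0 / longest

def channels_needed(input_rate, processing_rate):
    """Number of copies of the net: input rate / processing rate, rounded up."""
    return max(1, math.ceil(input_rate / processing_rate))

# Hypothetical figures for a net with three transitions and two flow paths.
activity = {"t1": 20.0, "t2": 20.0, "t3": 10.0}
capacity = 10.0
paths = [["t1", "t2"], ["t1", "t3"]]

upper = max_rate(activity, capacity)                  # 0.5 tasks per second
lower = min_rate_estimate(activity, capacity, paths)  # 0.25 tasks per second
copies = channels_needed(0.6, lower)                  # 3 channels
```

Sizing the number of channels against the lower estimate is a conservative choice made for this sketch; the text does not say which estimate is used.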

In  the  second  stage,  the  Petri  Net  depicting  the  information  processing  is  augmented  and  is 
transformed  into  the  Petri  Net  of  the  DMO.  This  Petri  Net  delimits  the  functions  performed  by 
each  decisionmaker  by  introducing  resource  places  that  represent  the  constraints  on  the 
decisionmakers  to  perform  one  function  at  any  time.  Each  of  these  places  is  connected  so  that  it 
is  the  output  of  the  last  and  the  input  of  the  first  transition  allocated  to  the  decisionmaker. 

This  Petri  Net  also  depicts  the  communications  among  members  of  the  DMO  by  representing 
each  communication  process  by  a  transition  and  the  respective  protocols  using  the  appropriate 
places  and  connectors.  When  allocating  functions  to  decisionmakers  the  following  sets  of 
constraints  must  be  met: 

The  functions  allocated  to  any  decisionmaker  must  observe  the  input-output  relationships 
imposed  by  the  Petri  Net  and  must  process  data  pertinent  to  the  same  subtask  of  the  DM 
organization.  They  must  also  belong  to  different  time  zones  or  slices  of  the  Petri  Net,  i.e.,  they 
must  process  data  at  different  times. 

Transitions  belonging  to  an  information  flow  path  observe  the  input-output  relationships  and  are 
executed  sequentially.  Thus  they  satisfy  both  of  the  above  constraints.  Transitions  on  different 
flow  paths  violate  the  first  requirement  (input-output).  Thus  the  feasible  solutions  to  the  function 
allocation  problem  are  the  sets  of  functions  that  correspond  to  the  information  flow  paths. 
Consequently,  the  functions  of  each  information  flow  path  are  allocated  to  a  different 
decisionmaker. 

In  general,  the  functions  represented  by  the  transitions  require  different  specialization  from  the 
decisionmakers.  Hence  when  considering  alternate  sets  of  functions,  the  respective  training 
requirements  must  be  considered.  Due  to  specialization  requirements,  it  may  be  necessary  to 
allocate some of the transitions of an information flow path to one decisionmaker and the
remaining ones to another.


Another consideration is the sensitivity of the timeliness measure to communications jamming.
A function allocation that places two or more communication links on any information flow
path is more sensitive to communications jamming than one that places only one communication
link on an information flow path.

To  evaluate  a  design,  the  accuracy  of  response,  the  expected  response  time,  and  the  processing 
rate  of  the  DMO  are  computed  for  all  decision  strategies.  Then  a  Measure  of  Effectiveness  of  the 
DMO  is  defined  in  the  strategy  space  as  the  ratio  of  decision  strategies  that  satisfy  the 
requirements  to  the  total  number  of  decision  strategies.  If  the  MOE  value  is  satisfactory  then  the 
design  is  accepted;  else  the  design  is  modified  until  a  satisfactory  MOE  value  is  obtained. 
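
As a rough illustration of the evaluation step, the sketch below computes the Measure of Effectiveness over a finite set of decision strategies. The class `StrategyResult`, the function names, and all numerical values are hypothetical; the sketch assumes that the cost, timeliness, rate, and workload figures for each strategy have already been computed by other means.

```python
from dataclasses import dataclass
from typing import Sequence

@dataclass
class StrategyResult:
    cost: float              # expected cost J
    delay: float             # timeliness measure T
    rate: float              # task processing rate R
    loads: Sequence[float]   # processing rate F of each decisionmaker

def meets_requirements(r, J0, T0, R0, F0):
    """True when the three requirements and the workload constraint all hold."""
    return (r.cost < J0 and r.delay < T0 and r.rate > R0
            and all(f < F0 for f in r.loads))

def measure_of_effectiveness(results, J0, T0, R0, F0):
    """Fraction of the evaluated decision strategies that meet the requirements."""
    satisfied = sum(meets_requirements(r, J0, T0, R0, F0) for r in results)
    return satisfied / len(results)

# Hypothetical results for four decision strategies of a two-member DMO.
results = [
    StrategyResult(0.10, 4.0, 0.30, (5.0, 6.0)),
    StrategyResult(0.25, 3.0, 0.30, (5.0, 6.0)),   # expected cost too high
    StrategyResult(0.12, 4.5, 0.28, (7.0, 4.0)),
    StrategyResult(0.08, 3.5, 0.35, (6.5, 7.5)),
]
moe = measure_of_effectiveness(results, J0=0.20, T0=5.0, R0=0.25, F0=8.0)
```

With these figures three of the four strategies qualify, so the MOE is 0.75; whether that value is satisfactory is a judgment left to the designer.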

Documentation: 

[1]  A.  H.  Levis  and  S.  K.  Andreadakis,  "Computer-Aided  Analysis  of  Organizations,"  Proc. 
25th IEEE Conference on Decision and Control, Athens, Greece, December 1986.

[2]  S.K.  Andreadakis  and  A.  H.  Levis,  "Accuracy  and  Timeliness  in  Decision-Making 
Organizations," Proc. 10th IFAC World Congress, July 27-31, 1987, Munich, FRG.

Presentations: 

[1] S. K. Andreadakis and A. H. Levis, "Accuracy and Timeliness in Decision-Making
Organizations," 9th MIT/ONR Workshop on C3 Systems, June 1986, Monterey, California.

[2]  S.  K.  Andreadakis  and  A.  H.  Levis,  "Design  Methodology  for  Decision-Making 
Organizations,"  C3  Symposium,  June  1987,  Washington  DC. 

The  ability  of  a  distributed  tactical  decisionmaking  organization  to  carry  out  its  tasks  in  a  timely 
manner  depends  on  two  types  of  constraints.  The  first  type  is  related  to  the  internal  organizational 
structure  that  determines  how  the  various  operations  occur  in  the  process:  some  tasks  are 
processed  sequentially,  while  others  are  processed  concurrently.  The  sequential  and  concurrent 
events  are  coordinated  by  the  communication  and  execution  protocols  among  the  individual 
organization  members. 
The second type includes time and resource constraints. The time constraints derive from the
task  execution  times  -  the  time  necessary  to  perform  each  task.  The  organization  also  has  limited 
resources;  depending  on  which  of  the  resources  are  available  at  a  given  instant,  some  activities 
can  take  place  while  others  must  be  delayed.  The  Petri  Net  formalism  provides  a  convenient  tool 
for  analyzing  the  behavior  of  organizations  with  asynchronous  protocols  that  allow  for 
concurrent  processing. 




Organizations with Decision Aids


In  the  work  of  Remy  and  Andreadakis,  the  organizations  consist  of  humans  alone.  Alternatively, 
the  effect  of  decision  aids  is  subsumed  in  the  model  of  the  organization  member.  However,  in 
considering  command  and  control  systems,  it  is  necessary  that  the  contributions  and  effects  of 
decision aids be made explicit. Jean Louis Grevet, under the supervision of Dr. A. H. Levis, has
started  a  thesis  project  that  will  try  to  build  on  the  case  study  done  by  S.  Weingaertner  in  his 
thesis  and  develop  a  methodology  for  the  modeling  and  analysis  of  decision  aids  in  an 
organization. 


Performance Evaluation of Organizations with Asynchronous Protocols


Problem  Statement:  In  earlier  work  by  Jin,  the  response  time  of  a  decisionmaking  organization 
was  computed  using  an  algorithm  based  on  the  Petri  Net  representation.  The  definition  of 
response  time  was  the  time  interval  between  the  moment  a  stimulus  is  received  by  the 
organization  and  the  moment  a  response  is  made.  This  measure  of  performance  is  a  static 
measure insofar as it assumes that there are no other tasks being processed by the organization.
A more realistic estimate of response time will be obtained if the dynamic behavior of the
organization is taken into account. More precisely, the research problem is to evaluate the
performance of a DMO with respect to the following time-related measures:

(a)  Maximum  Throughput  Rate:  This  is  the  maximum  rate  at  which  external  inputs  can  be 
processed;  a  higher  rate  would  lead  to  the  formation  of  queues  of  unbounded  length. 


(b) Execution Schedule: Let processing of arriving inputs start at t0 and let the inputs be
processed at the maximum throughput rate. The earliest instants of time at which the various
tasks can be performed in the repetitive process constitute the optimum execution schedule;
any other schedule will lead to longer response times.

Results:  The  time-related  performance  of  a  DMO,  as  measured  by  the  maximum  throughput  rate 
and  the  execution  schedule,  has  been  analyzed  and  evaluated  by  Herve  P.  Hillion  under  the 
supervision of Dr. A. H. Levis. The approach was based on modeling the DMO as a Timed Petri
Net. Two constraints have been modeled to characterize the bounded rationality of human
decisionmakers. The time associated with individual processes reflects a processing rate
limitation, while the resource limitation models the limited capacity of short-term memory, which
bounds the amount of information that a DM can handle at the same time. Both considerations
are modeled as a constraint on the total number of inputs that can be processed simultaneously.

The maximum throughput rate has been expressed as a function of the resource and time
constraints in the following manner. The inclusion of the resource constraints in the Petri Net
model results in directed circuits (or loops) which are characterized by: (a) the circuit processing
time, p, defined as the sum of the different task processing times of the circuit; p represents the
amount of time it takes one input to complete the processing operations of the circuit; (b) the
resources available, n, which bound the total number of inputs that can be processed at the same
time in the circuit.


For  a  given  circuit,  the  ratio  n/p  characterizes  the  average  circuit  processing  rate.  The  minimum 
average  circuit  processing  rate,  taken  over  all  the  directed  circuits  of  the  net,  determines  the 
maximum throughput rate of the deterministic system, i.e., when all the task processing times
are deterministic. For the case of stochastic processing times, an upper bound is obtained for the
maximum throughput rate. In that case, the average circuit processing time can be computed.
The determination of the critical circuits, for which the corresponding average processing rate is
minimal, provides a clear way of comparing different organizations. These critical circuits are the
ones that, because of the time and resource constraints, bound the throughput rate. Therefore,
there is now a direct way to identify how different constraints affect organizational performance.
Consequently, the problem of modifying the right constraints so as to improve the performance
of the organizations (and meet mission requirements) becomes transparent.
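
For the deterministic case, the throughput bound and the critical circuits can be sketched as below. The sketch assumes that the directed circuits have already been enumerated and that each is summarized by its resource count n and its processing time p; the function name and the example figures are hypothetical.

```python
def max_throughput(circuits):
    """Return the minimum of n/p over all circuits and the critical circuits.

    circuits: list of (n, p) pairs, where n is the number of resources in the
    circuit and p is the circuit processing time.
    """
    rates = [n / p for n, p in circuits]
    bound = min(rates)
    critical = [i for i, r in enumerate(rates) if r == bound]
    return bound, critical

# Hypothetical net with three directed circuits.
circuits = [(1, 5.0), (2, 8.0), (3, 15.0)]
bound, critical = max_throughput(circuits)
```

Here the first and third circuits are critical: adding a resource to the second circuit, or shortening its tasks, would leave the throughput unchanged.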


A  method  for  obtaining  and  analyzing  the  exact  execution  schedule  when  processing  times  are 
deterministic has been developed. A representation defined by the slices of the Petri Net allows
for  the  precise  characterization  of  the  causal  relations  in  the  DMO.  The  causal  relationships  result 
in  the  partial  ordering  of  the  different  operations.  The  execution  schedule  so  obtained  determines 
the  earliest  instants  at  which  the  various  tasks  can  be  executed  in  real-time  for  a  process  that 
occurs  repetitively. 
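
The idea of an earliest execution schedule can be illustrated with the standard earliest-firing recursion for a timed event graph. This is a sketch under stated assumptions, not the slice-based method of the thesis: each place is described by its input transition, its output transition, and its initial token count; every directed circuit is assumed to hold at least one token; and the function and variable names are hypothetical.

```python
def earliest_schedule(tau, places, firings, t0=0.0):
    """Earliest start times of the first `firings` executions of each transition.

    tau:    {transition: processing time}
    places: list of (src, dst, tokens); `tokens` is the initial marking of the
            place between transition src and transition dst.
    """
    start = {t: [] for t in tau}   # start[t][k] = start of firing k (0-based)
    for k in range(firings):
        current = {t: t0 for t in tau}
        # Relax the precedence constraints until nothing changes. Token-free
        # places form no circuit, so len(tau) + 1 passes are always enough.
        for _ in range(len(tau) + 1):
            changed = False
            for src, dst, tokens in places:
                j = k - tokens     # firing of src that enables firing k of dst
                if j < 0:
                    continue       # enabled by a token present at t0
                begun = current[src] if tokens == 0 else start[src][j]
                if begun + tau[src] > current[dst]:
                    current[dst] = begun + tau[src]
                    changed = True
            if not changed:
                break
        for t in tau:
            start[t].append(current[t])
    return start

# Hypothetical two-task loop holding a single resource token.
tau = {"t1": 2.0, "t2": 3.0}
places = [("t1", "t2", 0), ("t2", "t1", 1)]
schedule = earliest_schedule(tau, places, firings=3)
```

In this example the loop has processing time 5 and one resource, so successive firings of each task are spaced 5 time units apart, which matches a throughput of one input per 5 time units.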


The  contribution  of  this  research  is  that  it  develops  two  MOPs  that  characterize  the  time-related 
behavior  of  a  distributed  tactical  decisionmaking  organization.  Furthermore,  the  concepts  and 
algorithms developed are oriented toward design: they indicate which design parameters need to
be  changed  to  meet  requirements. 


Documentation: 

[1]  H.  P.  Hillion,  "Performance  Evaluation  of  Decisionmaking  Organizations  Using  Timed  Petri 
Nets,"  LIDS-TH-1590,  MS  Thesis,  Laboratory  for  Information  and  Decision  Systems,  MIT, 
Cambridge,  MA,  August  1986. 

[2]  H.  P.  Hillion  and  A.  H.  Levis,  "Timed  Event-Graph  and  Performance  Evaluation  of 
Systems,"  LIDS-P-1639,  Laboratory  for  Information  and  Decision  Systems,  MIT,  January 
1987;  accepted  for  presentation  at  the  8th  European  Workshop  on  Applications  and  Theory 
of Petri Nets, Zaragoza, Spain, 24-27 June 1987.


3.5.2 Computer Aided Evaluation of System Architectures


During  the  last  few  years,  a  number  of  problems  regarding  distributed  tactical  decisionmaking 
have  been  addressed  and  models,  algorithms,  and  methods  have  resulted  that  are  useful  for 
answering  specific  aspects  of  the  overall  problem.  In  order  to  integrate  these  results  into  a 
consistent  methodology  and  to  provide  the  means  for  designing  an  experimental  program,  a 
computer  aided  design  system  has  been  developed.  While  the  primary  support  for  this 
development  has  been  by  the  Basic  Research  Group  (BRG)  of  the  Technical  Panel  on  C3  of  the 
Joint Directors of Laboratories, there has been sufficient contribution by the staff of this project
to warrant its inclusion in this report. The components of the system contributed by this project
are  identified  by  "DTDM  support"  in  the  detailed  description  that  follows. 

The  design  workstation  has  been  named  CAESAR  for  Computer  Aided  Evaluation  of  System 
ARchitectures.  It  consists  of  four  major  components: 

The  Architecture  Generator  which  constructs  feasible  organizational  forms  using  Petri 
Nets. 

The  Analysis  and  Evaluation  Module  which  contains  many  of  the  algorithms  for  the 
computation  of  the  Measures  of  Performance. 

A  Data  Base  which  is  used  to  store  the  results  of  the  analysis. 

The  Locus  Module  that  constructs  the  generalized  Performance  Workload  locus  of  an 
organization  and  can  be  used  to  evaluate  Measures  of  Effectiveness. 

The  structure  of  the  software  system  is  shown  in  Figure  1.  The  individual  modules  and  their 
status  are  described  below. 

LIST OF MODULES IN CAESAR


A.  ARCHITECTURE  GENERATOR 

DMO Gen.AT   Program that generates the Petri Nets of Decisionmaking
Organizations that satisfy a set of structural constraints,
as well as constraints imposed by the user. The
algorithm is based on P. Remy's thesis (1986) and has
been implemented in DOS 3.0 ©IBM, using Turbo
Pascal 3.01A ©Borland International and Screen
Sculptor ©Software Bottling Company.

Status: Program operational. It requires an interface with
DMO Des.AT so that a graphical description of the
feasible architectures can be obtained directly. (DTDM
support)

DMO Des.AT   Interactive graphics program for the construction of the
Petri Nets of arbitrary organizational architectures. It can
be used to create and store subsystems and to combine
them to form large organizational structures. The program,
developed by I. Kyratzoglou, also creates the analytical
description of the Petri Nets. Implemented in DOS 3.0,
Professional Fortran, Graphics Tool Kit, and Graphic
Kernel System, all ©IBM.

Status: To be completed by June 1. (JDL support)

DMO Des.Mac   Interactive graphics program for the construction of the
Petri Nets of arbitrary organizations. It can be used to
design organizations of arbitrary size through the use of
nested subnets. Program developed for the Apple
Macintosh by the Meta Software Corp. using the Design
Open Architecture System ©Meta Software Corp. The
program creates the analytical description of the Petri
Net and stores the functions and attributes represented
by the transitions, places, and connectors. Program
enhanced by J. L. Grevet to be consistent with the analytical
description of Petri Nets used in various algorithms.

Status: Program operational. (JDL support)

MacLink ©Dataviz   Commercial software for converting and transmitting
files between the DOS machines and the Macintosh.

Status: MacLink has been installed and is operational: it
can transfer the data structure of a Petri Net from the
DMO Des.Mac module to the Analysis and Evaluation
Module.


Incidence Matrix/Attributes   Standard form for the data structure of Petri Nets. The
files contain the incidence matrix or flow matrix of the
Petri Net and the attributes and functions associated with
the elements of the net.

Status: Standard version of incidence matrix has been
implemented; the specifications for the attribute file are
being developed. Expected completion date: July 1.
(DTDM support)


B.  ANALYSIS  AND  EVALUATION  MODULE 

Matrix Conversion  Simple algorithm that transforms the incidence matrix
into the interconnection matrix used in Jin's algorithm.
Algorithm in Turbo Pascal 3.01A.

Status:  Algorithm  developed  by  Jin  is  operational. 
(DTDM  support) 

Paths  Algorithm developed by Jin in her thesis that determines
all the simple paths and then constructs the concurrent
paths in an organizational architecture. This is an
efficient algorithm that obtains the answers by scanning
the interconnection matrix. Algorithm in Turbo Pascal
3.01A.

Status:  Program  is  operational.  (DTDM  support) 
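
As a rough illustration of what determining all the simple paths from an interconnection matrix involves, the following Python sketch enumerates simple paths by depth-first search. It is not Jin's algorithm, whose details are not given here; the function name `simple_paths`, the 0/1 adjacency-matrix encoding, and the small example matrix are all assumptions made for the example.

```python
def simple_paths(adj, source, sink):
    """Enumerate all simple paths from source to sink.

    adj is a square 0/1 matrix; adj[a][b] == 1 means a link from node a to node b.
    (Hypothetical encoding, chosen only for this illustration.)
    """
    paths = []

    def extend(path):
        node = path[-1]
        if node == sink:
            paths.append(list(path))
            return
        for nxt, linked in enumerate(adj[node]):
            # A simple path never visits the same node twice.
            if linked and nxt not in path:
                path.append(nxt)
                extend(path)
                path.pop()

    extend([source])
    return paths


# Four-node example: node 0 is the input, node 3 is the output.
ADJ = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]
PATHS = simple_paths(ADJ, 0, 3)
```

Exhaustive enumeration of this kind grows quickly with the size of the net, which is presumably why an efficient matrix-scanning method is emphasized in the entry above.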

Simple algorithm that calculates path delays and expected
delay when processing delays are constant. Algorithm in
Turbo Pascal 3.01A.

Status:  Algorithm  is  operational.  (DTDM  support) 

Algorithm  developed  by  Andreadakis  that  calculates 
measures  of  timeliness  when  the  processing  delays  are 
described  by  beta  distributions.  It  also  accounts  for  the 
presence  of  jamming  and  its  effect  on  timeliness. 
Algorithm in Turbo Pascal 3.01A.

Status:  Problem  specific  version  operational;  general 
version  to  be  completed  by  September  1.  (DTDM 
support) 

Res Con  Algorithm developed by Hillion in his thesis that
calculates the maximum throughput in a Timed Event
Graph, a special class of Petri Nets. It also determines the
optimal schedule in the presence of resource and time
constraints. The procedure incorporates an algorithm
proposed by Martinez and Silva for determining simple
paths through the calculation of s-invariants.

Status: Independent version of algorithm is operational;
integrated version in workstation to be operational by
June 1. (JDL support)


PW Comp 3  Algorithm for the computation of a three-person
organization's performance measure J (Accuracy) and
the workload of each one of the decisionmakers. The
algorithm computes the accuracy of the response and the
workload for each admissible decision strategy. This
version was developed by Andreadakis in Turbo Pascal.
Status: Program is operational. (DTDM support)

PW Comp 5  A variant of PW Comp 3, but for a five-person
organization modeling the ship control party of a
submarine. Algorithm developed by Weingaertner as part
of his thesis. Implemented in Turbo Pascal.

Status: Program is operational.


C. DATA BASE MODULE

LOCUS Data File  Data file in which the results from the evaluation of a
decisionmaking organization are stored. The file, as
currently structured, can accommodate five measures of
performance - accuracy, timeliness, and workload for
three persons. It also contains four indices that specify
the decision strategy associated with each record.

Status: Three-person organization version operational.
General structure to be implemented by June 1. (JDL
support)


D. LOCUS MODULE

LOCUS  Graphics plotting program that generates two- or three-
dimensional loci or two- and three-dimensional
projections of higher dimensional loci. This is the basic
program used to construct the Performance - Workload
locus of an organization. Basic version developed by
Andreadakis and Bohner and described in the latter's thesis.
Status: Version using professional graphics controller is
operational. Revised transportable version adhering to
the VDI standard and with improved user interface is
also operational. (DTDM support)

ISO Data  Algorithm for obtaining some measures of effectiveness
from the measures of performance stored in the Locus
Data file. Specifically, it finds isoquants: e.g., locus of
constant accuracy, or constant workload.

Status: New version for microcomputers being
implemented by J. Azzola using a design by
Weingaertner. Expected date of completion is June 1.
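
The record layout described for the LOCUS Data File (five measures of performance plus four strategy indices) and the isoquant search described for ISO Data can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the class `LocusRecord`, its field names, the function `isoquant`, and the tolerance parameter are hypothetical and do not reflect the actual file format or the Turbo Pascal implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LocusRecord:
    """One evaluated decision strategy (hypothetical field names)."""
    strategy: Tuple[int, int, int, int]    # four indices identifying the strategy
    accuracy: float                        # performance measure J
    timeliness: float
    workload: Tuple[float, float, float]   # one workload value per decisionmaker


def isoquant(records: List[LocusRecord], level: float, tol: float = 1e-3) -> List[LocusRecord]:
    """Return the records whose accuracy lies within tol of the requested level."""
    return [r for r in records if abs(r.accuracy - level) <= tol]


RECORDS = [
    LocusRecord((1, 1, 1, 1), 0.90, 2.0, (10.0, 12.0, 8.0)),
    LocusRecord((1, 2, 1, 1), 0.90, 2.5, (11.0, 9.0, 8.5)),
    LocusRecord((2, 1, 1, 2), 0.75, 1.5, (7.0, 9.5, 6.0)),
]
CONSTANT_ACCURACY = isoquant(RECORDS, 0.90)
```

An isoquant of constant workload could be obtained in the same way by filtering on one component of `workload` instead of `accuracy`.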


E. INPUT/OUTPUT

Output:  By adopting the Virtual Device Interface (VDI) standard
and the Enhanced Graphics standard, it is possible to
develop a version of the CAESAR software that is
transportable to other IBM PC ATs or compatibles and to
drive a wide variety of output devices: various monitors,
printers, laser printers, and pen plotters.

Input:  A uniform user interface with windowing capability is
needed to make the system usable by analysts and
designers. Commercially available software packages are being
investigated to select the most appropriate one. Expected
completion date is September 1.


We expect to have the transportable version of CAESAR operational by September 1, 1987
and to demonstrate it at the next annual review of the DTDM program. There have been some
delays, primarily due to a five-month delay in obtaining a properly configured AT-compatible
machine to serve as the workstation.


3.3.3 Design of Experiments


A  major  application  of  CAESAR  is  in  the  design  and  analysis  of  experiments  in  which 
different  organizational  forms  will  be  evaluated.  At  this  time,  V.  Jin  has  initiated  a  project, 
under  the  supervision  of  Dr.  A.  H.  Levis,  in  which  she  is  assessing  the  applicability  of 
certain  methodologies  in  the  physical  sciences  for  the  design  of  experiments  to  the  behavioral 
sciences.  In  the  meantime,  with  funding  from  Joint  Directors  of  Laboratories,  an  experiment 
is  being  carried  out  to  determine  the  stability  of  the  bounded  rationality  constraint  and  to 
obtain, if possible, values for it.
DECENTRALIZED DETECTION BY A LARGE NUMBER OF SENSORS

John  N.  Tsitsiklis* 


ABSTRACT 

We consider the decentralized detection problem, in which a number $N$ of identical sensors transmit
a finite-valued function of their observations to a fusion center which then decides which one of $M$
alternative hypotheses is true. We consider the case where the number of sensors tends to infinity.
We then show that it is asymptotically optimal to divide the sensors into $M(M-1)/2$ groups, with
all sensors in each group using the same decision rule in deciding what to transmit. We also show
how the optimal number of sensors in each group may be determined by solving a mathematical
programming problem. For the special case of two hypotheses and binary messages the solution
simplifies considerably: it is optimal (asymptotically, as $N \to \infty$) to have each sensor perform an
identical likelihood ratio test and the optimal threshold is very easy to determine numerically.
The (static) decentralized detection problem is defined as follows. There are $M$ hypotheses
$H_1, \ldots, H_M$, with known prior probabilities $P(H_j) > 0$, and $N$ sensors. Let $Y$ be a set endowed with
a $\sigma$-field $\mathcal{F}$ of measurable sets. Let $y_i$, $i = 1, \ldots, N$, the observation of the $i$-th sensor, be a random
variable taking values in $Y$. We assume that the $y_i$'s are conditionally independent and identically
distributed, given either hypothesis, with a known conditional distribution $P(y \mid H_j)$, $j = 1, \ldots, M$.

Let $D$ be a positive integer. Each sensor $i$ evaluates a $D$-valued message $u_i \in \{1, \ldots, D\}$, as a
function of its own observation; that is, $u_i = \gamma_i(y_i)$, where the function $\gamma_i : Y \mapsto \{1, \ldots, D\}$ is the
decision rule of sensor $i$ and is assumed to be a measurable function. The messages $u_1, \ldots, u_N$
are all transmitted to a fusion center which declares one of the hypotheses to be true, based on a
decision rule $\gamma_0 : \{1, \ldots, D\}^N \mapsto \{1, \ldots, M\}$. That is, the final decision $u_0$ of the fusion center is given
by $u_0 = \gamma_0(u_1, \ldots, u_N)$. The objective is to choose the decision rules $\gamma_0, \gamma_1, \ldots, \gamma_N$ of the sensors
and the fusion center so as to minimize the probability of error in the decision of the fusion center.
(An alternative formulation of the problem, of the Neyman-Pearson type, will also be considered
in the last section.)

The above defined problem and its variants have been the subject of a fair amount of recent
research [TS, E, TA, LS], especially for the case of binary hypotheses ($M = 2$) and binary messages
($D = 2$). For the latter case, it is known that any optimal set of decision rules has the following
structure. Each one of the sensors evaluates its message $u_i$ using a likelihood ratio test with an
appropriate threshold. Then, the fusion center makes its decision by performing a final likelihood
ratio test. (Here, the messages received by the center play the role of its observations.) Without
the conditional independence assumption we introduced, this result fails to hold and the problem is
intractable (NP-hard), even for the case of two sensors [TA]. Assuming conditional independence,
the optimal value of the threshold of each sensor may be obtained by finding all solutions of a set of
coupled algebraic equations (which are the person-to-person optimality conditions for this problem)
and by selecting the solution which results in the least cost. Unfortunately (and contrary to intuition),
even if the observations of each sensor are identically distributed (given either hypothesis) it is
not true that all sensors should use the same threshold (see the Appendix for an example). This
renders the computation of the optimal thresholds intractable when the number of sensors is large.
To justify this last claim, consider what is involved in just evaluating the cost associated with a fixed
set $\gamma_0, \gamma_1, \ldots, \gamma_N$ of decision rules if each sensor uses a different threshold. In order to evaluate
the expected cost, we have to perform a summation over all possible values of $(u_1, \ldots, u_N)$, which
means that there are $2^N$ terms to be summed. (This is in contrast to the case of equal thresholds,
in which the $u_i$'s are identically distributed and therefore the binomial formula may be used to
obtain a sum with only $N + 1$ summands.) Of course, to determine an optimal set of decision rules
this effort may have to be repeated a number of times. This suggests that the computational effort
grows exponentially with the number $N$ of sensors.
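
The contrast between the $2^N$-term sum and the $(N+1)$-term binomial sum can be made concrete with a short Python sketch for the case $M = 2$, $D = 2$. It is an illustration only: messages are coded as 0/1 rather than 1/2, each sensor is summarized by the assumed quantities $P(u_i = 1 \mid H_0)$ and $P(u_i = 1 \mid H_1)$, the fusion center uses the MAP rule, and the function names and numerical values are hypothetical.

```python
from itertools import product
from math import comb


def error_bruteforce(q0, q1, prior0=0.5):
    """MAP error probability by summing over all 2^N message vectors.

    q0[i] = P(u_i = 1 | H0), q1[i] = P(u_i = 1 | H1); sensors may differ.
    """
    err = 0.0
    for u in product((0, 1), repeat=len(q0)):
        p0, p1 = prior0, 1.0 - prior0
        for i, bit in enumerate(u):
            p0 *= q0[i] if bit else 1.0 - q0[i]
            p1 *= q1[i] if bit else 1.0 - q1[i]
        # The MAP rule picks the larger joint probability; the smaller one is error mass.
        err += min(p0, p1)
    return err


def error_binomial(n, q0, q1, prior0=0.5):
    """Same quantity when all sensors share one threshold: only N + 1 terms."""
    err = 0.0
    for k in range(n + 1):  # k = number of sensors reporting 1
        p0 = prior0 * comb(n, k) * q0 ** k * (1.0 - q0) ** (n - k)
        p1 = (1.0 - prior0) * comb(n, k) * q1 ** k * (1.0 - q1) ** (n - k)
        err += min(p0, p1)
    return err


SLOW = error_bruteforce([0.2] * 8, [0.7] * 8)
FAST = error_binomial(8, 0.2, 0.7)
```

With identical sensors the two functions agree up to rounding, but the first loops over 256 vectors for $N = 8$ while the second needs only 9 terms.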

The above discussion motivates the main results of this paper, which show that, for the case
$M = 2$, $D = 2$, it is asymptotically optimal to have each sensor use the same threshold, and
provide a simple method for computing the optimal threshold. For the general case of $M > 2$
hypotheses, it is no longer true, not even in the limit as $N \to \infty$, that each sensor should use
the same decision rule. Nevertheless, we show that, as $N \to \infty$, at most $M(M-1)/2$ different
decision rules need to be used by the sensors. The determination of an asymptotically optimal set
of decision rules is still a hard computational problem, except for the case where the observation
set $Y$ is finite and of small cardinality.

Notation: Throughout, $P_i$ will stand for the (conditional) measure $P(\cdot \mid H_i)$ on $(Y, \mathcal{F})$, under
hypothesis $H_i$. Furthermore, $E_i[\cdot]$ will stand for expectation with respect to the measure $P_i$.

II.  THE  BAYESIAN  PROBLEM. 

We start by noticing that, having fixed the decision rules $\gamma_1, \ldots, \gamma_N$ of the sensors, the optimal
decision for the fusion center is determined by using the maximum a posteriori probability (MAP)
rule. (The messages to the fusion center are to be thought of as measurements available to it.) Thus,
$\gamma_0$ is straightforward to determine in terms of $\gamma_1, \ldots, \gamma_N$. For this reason, we shall be concerned only
with the optimization with respect to $(\gamma_1, \ldots, \gamma_N)$. Any such set of decision rules will be denoted,
for convenience, by $\gamma^N$.

We introduce some more notation. Let $\Gamma$ be a set of decision rules among which the decision
rules of each sensor are to be selected. In general, we should take $\Gamma$ to be the set of all (measurable)
functions from $Y$ into the set $\{1, \ldots, D\}$. However, we may, for some reason, wish to restrict to a
smaller class of decision rules, possibly having some simplifying structure. We return to this issue
in Section III. Let $\Gamma^N$ be the Cartesian product of $\Gamma$ with itself, $N$ times. For any $\gamma^N \in \Gamma^N$, let
$J_N(\gamma^N)$ be the probability of an erroneous final decision by the fusion center (always assuming
that the fusion center uses the MAP rule). We are concerned with the minimization of $J_N(\gamma^N)$,
over all $\gamma^N \in \Gamma^N$, when $N$ is very large.

It is easy to show that, as the number of sensors grows to infinity, the probability of error goes
to zero, for any reasonable set of decision rules, in fact exponentially fast. Consequently, we need
a more refined way of comparing different sets of decision rules, as $N \to \infty$. To this effect, for
any given value of $N$ and any set $\gamma^N$ of decision rules for the $N$-sensor problem, we consider the
exponent of the error probability defined by

$$r_N(\gamma^N) = \frac{\log J_N(\gamma^N)}{N}.$$

Let $R_N = \inf_{\gamma^N \in \Gamma^N} r_N(\gamma^N)$ be the optimal exponent. Let $\Gamma_0^N$ be the set of all $\gamma^N \in \Gamma^N$ with
the property that the set $\{\gamma_1, \ldots, \gamma_N\}$ has at most $M(M-1)/2$ different elements. Let
$Q_N = \inf_{\gamma^N \in \Gamma_0^N} r_N(\gamma^N)$ be the optimal exponent, when we restrict to sets of decision rules in $\Gamma_0^N$. The
following result shows that, asymptotically, optimality is not lost if we restrict to $\Gamma_0^N$.

Theorem 1: Subject to Assumption 1 below, $\lim_{N \to \infty} (Q_N - R_N) = 0$.

The  rest  of  this  section  is  devoted  to  the  proof  of  Theorem  1.  We  first  need  to  introduce  some 
auxiliary  tools. 

Let us fix some $\gamma \in \Gamma$. The mapping from the true hypothesis $H_i$ to the decision of a sensor
employing the decision rule $\gamma$ may be thought of as a noisy channel which is completely described
by the probabilities

$$p_i^{\gamma}(d) = P_i(\gamma(y) = d).$$
The ability of such a channel to discriminate between hypotheses $H_i$ and $H_j$ ($i \neq j$) may be
quantified by a function $\mu_{ij}(\gamma, s)$, $s \in [0, 1]$, defined by the following formula [SGB]:

$$\mu_{ij}(\gamma, s) = \log \Big[ \sum_{d=1}^{D} \big(p_i^{\gamma}(d)\big)^{1-s} \big(p_j^{\gamma}(d)\big)^{s} \Big]. \tag{1}$$

We use here the convention $0^0 = 0$; thus, the summation in (1) is to be performed only over those
$d$'s for which $p_i^{\gamma}(d)\, p_j^{\gamma}(d) \neq 0$. Assuming that $\mu_{ij}(\gamma, s)$ is not infinite, it is easy to see that $\mu_{ij}(\gamma, s)$
is infinitely differentiable, as a function of $s$, and its derivatives are continuous on $[0, 1]$, provided
that we define the derivative at an endpoint as the limit when we approach the endpoint from the
interior.
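
A direct numerical transcription of formula (1) may help fix ideas. The Python sketch below is a minimal illustration: the function name `mu`, the representation of the channel by two probability lists, and the choice to return minus infinity when the two distributions have disjoint supports are assumptions made here, not part of the original text.

```python
from math import inf, log, sqrt


def mu(p_i, p_j, s):
    """log of the sum over d of p_i(d)^(1-s) * p_j(d)^s, with the convention 0^0 = 0.

    p_i[d], p_j[d] are the message probabilities under hypotheses H_i and H_j.
    """
    # Skip every d for which p_i(d) * p_j(d) == 0, as the convention requires.
    total = sum(a ** (1.0 - s) * b ** s for a, b in zip(p_i, p_j) if a > 0 and b > 0)
    return log(total) if total > 0 else -inf


# A rule that leaves the two hypotheses indistinguishable gives mu = 0 for every s.
SAME = mu([0.8, 0.2], [0.8, 0.2], 0.3)
# A rule that separates them gives a strictly negative value.
DIFFERENT = mu([0.8, 0.2], [0.3, 0.7], 0.5)
```

The more negative the value, the better the channel discriminates between the two hypotheses, which is why the later optimization problems minimize this quantity.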

Notice that, for any fixed $\gamma$, the function $\mu_{ij}(\gamma, s)$ is equal to $\log E[e^{sX}]$, where $X$ is the log-
likelihood ratio of the distributions $p_j^{\gamma}(\cdot)$ and $p_i^{\gamma}(\cdot)$, and where the expectation is with respect to the
distribution $p_i^{\gamma}(\cdot)$. As is well known, minimizing the moment generating function of a random variable
$X$ yields tight bounds on the probability of large deviations of $X$ from its mean. Since in this case
$X$ is the log-likelihood ratio, this method leads to tight bounds on the probability of error. One
particular such result that we will use is taken from [SGB]:

Lemma 1: Let there be two hypotheses $H'$ and $H''$. Let $x_1, \ldots, x_N$ be measurements taking values
in a finite set $\{1, \ldots, D\}$, which are conditionally independent given the true hypothesis, and suppose
that the conditional distribution of $x_i$, when $H$ is true, is described by $p_H^i(d) = P(x_i = d \mid H)$. Let

$$\mu(i, s) = \log \Big[ \sum_{d=1}^{D} \big(p_{H'}^i(d)\big)^{1-s} \big(p_{H''}^i(d)\big)^{s} \Big]$$

and $\mu(s) = \sum_{i=1}^{N} \mu(i, s)$. Assume that $\mu(i, s)$, $\mu'(i, s)$, $\mu''(i, s)$ exist and are finite, where a prime
stands for differentiation with respect to $s$. Let $s^*$ minimize $\mu(s)$ over $s \in [0, 1]$. Then,

a) There exists a decision rule for deciding between $H'$ and $H''$, on the basis of the measurements
$x_1, \ldots, x_N$, for which

$$P(\text{decide } H' \mid H'' \text{ is true}) + P(\text{decide } H'' \mid H' \text{ is true}) \leq 2 \exp\{\mu(s^*)\}.$$

b) For any rule for deciding between $H'$ and $H''$, on the basis of the measurements $x_1, \ldots, x_N$, we
have

$$P(\text{decide } H' \mid H'' \text{ is true}) + P(\text{decide } H'' \mid H' \text{ is true}) \geq \frac{1}{2} \exp\big\{\mu(s^*) - [2\mu''(s^*)]^{1/2}\big\},$$

where a prime indicates differentiation with respect to $s$.

Proof: Part (a) of the Lemma is the Corollary on p. 84 of [SGB]. For part (b), it is shown in [SGB]
(equation (3.42), p. 87) that

$$P(\text{decide } H' \mid H'' \text{ is true}) + P(\text{decide } H'' \mid H' \text{ is true}) \geq \frac{1}{2} \exp\big\{\mu(s) - s\mu'(s) - s[2\mu''(s)]^{1/2}\big\} + \frac{1}{2} \exp\big\{\mu(s) + (1-s)\mu'(s) - (1-s)[2\mu''(s)]^{1/2}\big\}, \quad \forall s \in (0, 1).$$

If $s^* \in (0, 1)$, we have $\mu'(s^*) = 0$ and the desired result follows immediately. If $s^* = 0$, we may
take the limit in the above inequality, as $s \downarrow 0$. Since $\mu''$ is continuous, and therefore bounded, we
have $\lim_{s \downarrow 0} s\mu''(s) = 0$, which yields

$$P(\text{decide } H' \mid H'' \text{ is true}) + P(\text{decide } H'' \mid H' \text{ is true}) \geq \frac{1}{2} \exp\{\mu(0)\} \geq \frac{1}{2} \exp\big\{\mu(0) - [2\mu''(0)]^{1/2}\big\}.$$

The last inequality follows because $\mu$ is convex and therefore $\mu''(s) \geq 0$, $\forall s$. The argument for the
case $s^* = 1$ is identical. •
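
To get a feeling for how far apart the two bounds of Lemma 1 can be, the following Python sketch evaluates them for $N$ independent, identically distributed binary measurements and compares them with the smallest achievable value of the error sum, which for this small case can be computed exactly. All names and the numerical example are assumptions made for illustration; the location of $s^*$ is found by ternary search, which relies on the convexity of $\mu$ noted in the proof.

```python
from math import comb, exp, log, sqrt


def mu_terms(p1, p2, s):
    """Return (mu, mu'') for one measurement with pmf p1 under H' and p2 under H''."""
    pairs = [(a, log(b / a)) for a, b in zip(p1, p2) if a > 0 and b > 0]
    f0 = sum(a * exp(s * l) for a, l in pairs)          # sum of p1^(1-s) * p2^s
    f1 = sum(a * l * exp(s * l) for a, l in pairs)      # first derivative in s
    f2 = sum(a * l * l * exp(s * l) for a, l in pairs)  # second derivative in s
    return log(f0), max(f2 / f0 - (f1 / f0) ** 2, 0.0)


def lemma_bounds(p1, p2, n):
    """Lower bound (part b) and upper bound (part a) for n i.i.d. measurements."""
    lo, hi = 0.0, 1.0
    for _ in range(200):  # ternary search for s* on [0, 1]
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if mu_terms(p1, p2, m1)[0] < mu_terms(p1, p2, m2)[0]:
            hi = m2
        else:
            lo = m1
    mu1, mu1_dd = mu_terms(p1, p2, 0.5 * (lo + hi))
    total_mu, total_mu_dd = n * mu1, n * mu1_dd  # mu(s) is a sum over measurements
    return 0.5 * exp(total_mu - sqrt(2.0 * total_mu_dd)), 2.0 * exp(total_mu)


def best_error_sum(p1, p2, n):
    """Smallest possible sum of the two conditional error probabilities (binary data)."""
    return sum(
        comb(n, k) * min(p1[0] ** k * p1[1] ** (n - k), p2[0] ** k * p2[1] ** (n - k))
        for k in range(n + 1)
    )


LOWER, UPPER = lemma_bounds([0.8, 0.2], [0.3, 0.7], 20)
EXACT = best_error_sum([0.8, 0.2], [0.3, 0.7], 20)
```

In this example the exact value lies between the two bounds, and the gap between them is governed by the second-derivative term, which is what the assumption introduced next is meant to control.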


The bounds of parts (a) and (b) of the Lemma could be far apart if $\mu''$ is left uncontrolled. For
this reason we introduce the following assumption:

Assumption 1: a) $|\mu_{ij}(\gamma, s)| < \infty$, $\forall \gamma \in \Gamma$, $\forall i \neq j$, $\forall s \in [0, 1]$.

b) There exists a constant $A$ such that $|\mu''_{ij}(\gamma, s)| \leq A$, $\forall s \in [0, 1]$, $\forall \gamma \in \Gamma$, $\forall i \neq j$.

The  content  of  this  Assumption  is  explored  in  Section  VI;  it  is  shown  there  that  it  corresponds 
to  some  minor  restrictions  on  the  distribution  of  the  observations,  which  are  satisfied  in  typical 
situations  of  practical  interest. 

As a preview of the remainder of the proof, we use Lemma 1, for each pair of distinct hypotheses,
to argue that the decision rules $\gamma_1, \ldots, \gamma_N$ of the sensors should be chosen so as to minimize

$$\max_{(i,j):\, i \neq j} \; \min_{s \in [0,1]} \; \sum_{k=1}^{N} \mu_{ij}(\gamma_k, s).$$

We reformulate this as a linear programming problem and use linear programming theory to show
that a small number of different $\gamma_k$'s suffices.


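Let $\Phi$ be the set of all finite subsets of $\Gamma$. For any $F \in \Phi$, let

$$\Lambda(F) = \min_{x_\gamma} \; \max_{(i,j):\, i \neq j} \; \min_{s \in [0,1]} \; \sum_{\gamma \in F} x_\gamma \mu_{ij}(\gamma, s),$$

where the minimization with respect to $x_\gamma$ is subject to the constraints

$$x_\gamma \geq 0, \quad \forall \gamma \in F, \tag{2a}$$

$$\sum_{\gamma \in F} x_\gamma = 1. \tag{2b}$$

Let

$$\Lambda^* = \inf_{F \in \Phi} \Lambda(F).$$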

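Let us fix $N$ and some collection $\gamma^N \in \Gamma^N$ of decision rules. Let $a = \min_i P(H_i)$. We then have,
using part (b) of Lemma 1,

$$J_N(\gamma^N) = \sum_{(i,j):\, i \neq j} P(\text{decide } H_i \mid H_j) P(H_j) \geq \frac{a}{2} \max_{(i,j):\, i \neq j} \exp\Big[ \sum_{k=1}^{N} \mu_{ij}(\gamma_k, s^*_{ij}) - \Big( 2 \sum_{k=1}^{N} \mu''_{ij}(\gamma_k, s^*_{ij}) \Big)^{1/2} \Big],$$

where $s^*_{ij}$ minimizes $\sum_{k=1}^{N} \mu_{ij}(\gamma_k, s)$ over $s \in [0, 1]$. Let $F$ be the set of different decision rules
(elements of $\Gamma$) which are present in the collection $\gamma^N$ of decision rules. For each $\gamma \in F$, let $x_\gamma$ be
the proportion of the sensors using decision rule $\gamma$; that is, $x_\gamma$ is equal to the number of $k$'s such
that $\gamma_k = \gamma$, divided by $N$. By construction, the coefficients $x_\gamma$ satisfy the constraints (2a)-(2b).
Using Assumption 1b to bound $\mu''_{ij}(\gamma_k, s^*_{ij})$, the definition of $x_\gamma$ and the definition of $\Lambda(F)$, we have

$$J_N(\gamma^N) \geq \frac{a}{2} \exp\Big[ \max_{(i,j):\, i \neq j} \; \min_{s \in [0,1]} N \sum_{\gamma \in F} x_\gamma \mu_{ij}(\gamma, s) - (2NA)^{1/2} \Big] \geq \frac{a}{2} e^{N\Lambda(F) - (2NA)^{1/2}} \geq \frac{a}{2} e^{N\Lambda^* - (2NA)^{1/2}}.$$

This shows that $r_N(\gamma^N) \geq \Lambda^* - (2A/N)^{1/2} + \frac{1}{N} \log(a/2)$. Taking the limit as $N \to \infty$, we obtain

$$\liminf_{N \to \infty} R_N \geq \Lambda^*. \tag{3}$$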
Lemma 2: $\Lambda^* = \inf_{F \in \Phi_0} \Lambda(F)$, where $\Phi_0$ is the collection of all subsets of $\Gamma$ of cardinality no larger
than $M(M-1)/2$.

Proof: Given some $F \in \Phi$, let $s^*_{ij}$, $x^*_\gamma$ be such that the constraints (2a), (2b) are satisfied and

$$\max_{(i,j):\, i \neq j} \sum_{\gamma \in F} x^*_\gamma \mu_{ij}(\gamma, s^*_{ij}) = \Lambda(F).$$

(Such $s^*_{ij}$, $x^*_\gamma$ exist because the quantity $\max_{(i,j):\, i \neq j} \sum_{\gamma \in F} x_\gamma \mu_{ij}(\gamma, s_{ij})$ is continuous in $s_{ij}$,
$x_\gamma$ and is defined over a compact set; therefore, the minimum arising in the definition of $\Lambda(F)$
is attained.) In particular, if the $s^*_{ij}$'s are fixed, then the $x^*_\gamma$'s are determined by minimizing
$\max_{(i,j):\, i \neq j} \sum_{\gamma \in F} x_\gamma \mu_{ij}(\gamma, s^*_{ij})$, subject to the constraints (2a)-(2b). This minimization is
equivalent to the following linear programming problem:

$$\min \lambda$$

subject to

$$\lambda \geq \sum_{\gamma \in F} x_\gamma \mu_{ij}(\gamma, s^*_{ij}), \quad \forall i, j, \; j \neq i,$$

$$x_\gamma \geq 0, \quad \forall \gamma \in F,$$

$$\sum_{\gamma \in F} x_\gamma = 1.$$

Let $T$ be the cardinality of the set $F$. The above defined linear program has $T + 1$ variables and
$T + 1 + M(M-1)/2$ constraints. From linear programming theory [PS], we know that there exists an
optimal solution at which the number of constraints for which equality holds is no smaller than the
number of variables. Therefore, with this optimal solution at most $M(M-1)/2$ of the constraints
hold with a strict inequality, which implies that at most $M(M-1)/2$ of the $x_\gamma$'s are nonzero.
Therefore, for any $F \in \Phi$ there exists some $F' \in \Phi_0$ such that $\Lambda(F') \leq \Lambda(F)$. This completes the
proof of Lemma 2. •

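Let us fix some $N$ and some $\epsilon > 0$. Let $F$ be a subset of $\Gamma$ of cardinality no larger than
$M(M-1)/2$ (that is, $F \in \Phi_0$), such that $\Lambda(F) \leq \Lambda^* + \epsilon$, which exists because of Lemma 2. Let
$x^*_\gamma$ and $s^*_{ij}$ be such that

$$\max_{(i,j):\, i \neq j} \sum_{\gamma \in F} x^*_\gamma \mu_{ij}(\gamma, s^*_{ij}) = \Lambda(F) \leq \Lambda^* + \epsilon.$$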
We now define a collection $\gamma^N$ of decision rules to be used by the $N$ sensors: for each $\gamma \in F$, we
let exactly $\lfloor N x^*_\gamma \rfloor$ of them use the decision rule $\gamma$; if there are any remaining sensors, which is the
case if $N x^*_\gamma$ is not an integer for some $\gamma$, we let these sensors use an arbitrary decision rule out of
the set $F$. Let $N_0$ be the number of these remaining sensors.
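
The construction of this collection of decision rules is easy to mechanize. The Python sketch below is one possible rendering, under the assumption that the optimal proportions are available as exact fractions (to avoid rounding problems in the floor operation); the function name `allocate` and the rule labels are hypothetical.

```python
from fractions import Fraction
from math import floor


def allocate(n_sensors, proportions):
    """Give floor(N * x) sensors to each decision rule and count the leftovers.

    proportions maps a rule label to its share x (Fractions summing to 1).
    Returns (counts, n_remaining); the remaining sensors may use any rule in F.
    """
    counts = {rule: floor(n_sensors * x) for rule, x in proportions.items()}
    n_remaining = n_sensors - sum(counts.values())
    return counts, n_remaining


COUNTS, N0 = allocate(10, {"rule_a": Fraction(1, 3), "rule_b": Fraction(2, 3)})
```

Since each floor discards less than one sensor, the number of leftover sensors never exceeds the number of rules in $F$, so it stays bounded as $N$ grows.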

We now estimate the probability of error under this particular $\gamma^N$. The probability of error is
bounded above by the probability of error for the case where the fusion center chooses to ignore
the messages transmitted by the last $N_0$ sensors, and this is what we will assume. We now have

$$J_N(\gamma^N) = \sum_{(i,j):\, i \neq j} P(\text{decide } H_i \mid H_j \text{ is true}) P(H_j) \leq M^2 \max_{(i,j):\, i \neq j} \big[ P(\text{decide } H_i \mid H_j \text{ is true}) + P(\text{decide } H_j \mid H_i \text{ is true}) \big]. \tag{4}$$

The expression inside the brackets in the right hand side of (4) refers to the probabilities of error
for a context in which $H_i$ and $H_j$ are the only hypotheses. Since the fusion center uses the MAP
rule, it is using a decision rule which would be optimal even if it had to discriminate only between
the two hypotheses $H_i$ and $H_j$ (always assuming that the last $N_0$ messages are ignored). Thus, for
each pair of hypotheses, the upper bound on the probability of error furnished by Lemma 1(a) is
applicable. This yields

$$J_N(\gamma^N) \leq 2M^2 \max_{(i,j):\, i \neq j} \exp\Big\{ \sum_{\gamma \in F} \lfloor N x^*_\gamma \rfloor \mu_{ij}(\gamma, s^*_{ij}) \Big\}. \tag{5}$$


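We now use the inequality $N x^*_\gamma - \lfloor N x^*_\gamma \rfloor \leq 1$ to obtain

$$\sum_{\gamma \in F} \lfloor N x^*_\gamma \rfloor \mu_{ij}(\gamma, s^*_{ij}) \leq \sum_{\gamma \in F} N x^*_\gamma \mu_{ij}(\gamma, s^*_{ij}) + \sum_{\gamma \in F} |\mu_{ij}(\gamma, s^*_{ij})| \leq N \sum_{\gamma \in F} x^*_\gamma \mu_{ij}(\gamma, s^*_{ij}) + K,$$

where $K$ is a constant independent of $N$. We substitute the above inequality in the right hand side
of (5), then take logarithms and divide by $N$ to obtain

$$Q_N \leq \frac{\log J_N(\gamma^N)}{N} \leq \frac{2 \log M}{N} + \frac{\log 2}{N} + \frac{K}{N} + \max_{(i,j):\, i \neq j} \sum_{\gamma \in F} x^*_\gamma \mu_{ij}(\gamma, s^*_{ij}) \leq \Lambda^* + \epsilon + \frac{K'}{N},$$

where $K'$ is another constant independent of $N$. We take the limit as $N \to \infty$ and use the fact
that $\epsilon$ was arbitrary to conclude that $\limsup_{N \to \infty} Q_N \leq \Lambda^*$. We combine this inequality with (3)
and the obvious inequality $R_N \leq Q_N$ to complete the proof of the theorem. •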


Let us start by stressing that the proof of Theorem 1 is constructive and suggests a procedure
for determining an asymptotically optimal set of decision rules. Namely, we have to solve the
optimization problem defining $\Lambda^*$. The value of $\Lambda^*$ is the optimal exponent and the associated
optimal values of the $x_\gamma$'s are the proportions of the sensors which should use each decision rule $\gamma$.

Theorem 1 is most useful in the case of binary hypotheses ($M = 2$) and binary messages ($D = 2$).
For that case it is known [TS] that, without any loss of optimality, we may assume that each sensor
decides what to transmit by performing a likelihood ratio test, with an appropriate threshold. We
thus let $\Gamma$ be the set of all such decision rules. Furthermore, in this case we have $M(M-1)/2 = 1$
and Theorem 1 implies that it is asymptotically optimal to let every sensor use the same threshold.
In order to compute $\Lambda^*$ we only need to optimize over all subsets of $\Gamma$ of cardinality 1. Therefore,
the optimal threshold may be computed by solving the optimization problem

$$\min_{\gamma \in \Gamma} \; \min_{s \in [0,1]} \mu_{12}(\gamma, s). \tag{6}$$


Notice that each $\gamma \in \Gamma$ can be described by a single real number, the value of the threshold being
employed. We are therefore dealing with a nonlinear optimization problem in two dimensions. In
typical problems, the probabilities $p_i^{\gamma}(d)$ are given by simple analytical expressions, as a function
of the threshold corresponding to $\gamma$. Therefore, simple analytical expressions are also available for
$\mu_{12}(\gamma, s)$. It is known that $\mu_{12}(\gamma, s)$ is a convex function of $s$, for every $\gamma$ [SGB], which
makes the optimization with respect to $s$ easier. Unfortunately, we are not aware of any simple but
nontrivial examples in which the solution of the above optimization problem and the corresponding
value of the optimal threshold may be obtained analytically.
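
A numerical solution of a two-dimensional problem of this type is straightforward. The Python sketch below solves it by brute-force grid search for one assumed observation model, chosen here purely for illustration and not taken from the text: the observation is Gaussian with unit variance, with mean 0 under $H_1$ and mean `shift` under $H_2$, and a sensor with threshold $t$ sends the second message when $y > t$. The function names, grid range, and grid resolution are likewise assumptions.

```python
from math import erfc, log, sqrt


def gaussian_tail(x):
    """P(Z > x) for a standard normal random variable Z."""
    return 0.5 * erfc(x / sqrt(2.0))


def mu12(t, s, shift=1.0):
    """Formula (1) for the binary message produced by comparing y with threshold t."""
    a1 = gaussian_tail(t)          # P(y > t | H1), mean 0
    a2 = gaussian_tail(t - shift)  # P(y > t | H2), mean `shift`
    return log((1.0 - a1) ** (1.0 - s) * (1.0 - a2) ** s + a1 ** (1.0 - s) * a2 ** s)


def best_threshold(shift=1.0, steps=200):
    """Grid search over thresholds t in [-2, shift + 2] and exponents s in [0, 1]."""
    best = (0.0, None, None)  # (value, threshold, s); mu12 equals 0 at s = 0
    for i in range(steps + 1):
        t = -2.0 + (shift + 4.0) * i / steps
        for k in range(steps + 1):
            s = k / steps
            value = mu12(t, s, shift)
            if value < best[0]:
                best = (value, t, s)
    return best


VALUE, THRESHOLD, S_STAR = best_threshold()
```

Because the function is convex in $s$ for each fixed threshold, the inner grid over $s$ could be replaced by a one-dimensional convex search; the outer search over the threshold has no such guarantee in general.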

In the case of binary hypotheses ($M = 2$) and messages of arbitrary cardinality $D > 2$, it is
known that likelihood ratio tests are again optimal, except that each decision rule consists of $D - 1$
thresholds which determine which one of the $D$ messages is to be sent. The same discussion as
for the case of $D = 2$ applies here and (asymptotically) each sensor should use the same set of
thresholds. The only difference is that $\gamma$ is parametrized by a $(D-1)$-dimensional real vector (as
opposed to a scalar). Thus, the problem (6), which needs to be solved in order to determine the
optimal thresholds, is a $D$-dimensional optimization problem. This may become quite hard unless
$D$ is small, the reason being that, in general, $\mu_{12}(\gamma, s)$ is not a convex function of the parameters
specifying $\gamma$.

For the case where $M > 2$, Theorem 1 is less useful for computing an asymptotically optimal set
of decision rules. The reason is that we have to solve an optimization problem over all subsets
of $\Gamma$ of cardinality $M(M-1)/2$. In principle, it seems possible to reformulate the optimization
problem defining $\Lambda^*$ in a way that avoids having to consider each such subset of $\Gamma$ (which would
be impossible anyway if $\Gamma$ is infinite). Namely, we might perform the minimization

$$\min_{x \in P} \; \max_{(i,j):\, i \neq j} \; \min_{s \in [0,1]} \int_{\Gamma} \mu_{ij}(\gamma, s) \, dx(\gamma),$$

where $x(\cdot)$ is a positive measure on $\Gamma$ with $x(\Gamma) = 1$ and where $P$ is the set of all such measures.
Leaving aside the technical difficulties in showing that this is an equivalent problem, it still does
not seem particularly promising from a computational point of view. It appears that the only cases
in which a numerical solution is possible are those cases in which the set $Y$ is finite and has small
cardinality, because in that case $\Gamma$ is also finite and has small cardinality. Notice that if $F_1 \subset F_2$,
then $\Lambda(F_2) \leq \Lambda(F_1)$. Therefore, if $\Gamma$ is finite, we have $\Lambda^* = \Lambda(\Gamma)$. This suggests that in order
to compute $\Lambda^*$ it is preferable to ignore Theorem 1: instead of computing $\Lambda(F)$ for each $F$ of
cardinality $M(M-1)/2$, and then taking the minimum, we may just compute $\Lambda(\Gamma)$.

An Example: Let $M = 3$, $D = 2$ and let $Y = \{1, 2, 3\}$. Let each hypothesis be equally likely
and let the statistics of the observation $y$ be as follows: conditioned on $H_i$ being true, $y$ takes the
value $i$ with probability $1 - 2\epsilon$ and takes each one of the remaining two values with probability $\epsilon$
($0 < \epsilon < 1/4$). There are three possible decision rules. The $i$-th possible decision rule is: $\gamma_i(y) = 1$ if
and only if $y = i$. Notice that $\gamma_1$ does not provide any information useful in discriminating between
$H_2$ and $H_3$. Thus, $\mu_{23}(\gamma_1, s) = 0$, $\forall s$; similarly, $\mu_{13}(\gamma_2, s) = \mu_{12}(\gamma_3, s) = 0$, $\forall s$. Furthermore,
by symmetry, $\mu_{12}(\gamma_1, s) = \mu_{13}(\gamma_1, s) = \mu_{23}(\gamma_2, s)$, etc. Let $a$ be the value of the minimum of
$\mu_{12}(\gamma_1, s)$ over $s \in [0, 1]$. Let $x_i$ be the proportion of sensors using $\gamma_i$. The optimal values of
$x_1, x_2, x_3$ are determined by solving the problem

$$a \max_{x_1, x_2, x_3} \min\{x_1 + x_2, \; x_1 + x_3, \; x_2 + x_3\},$$

over the unit simplex. It is easy to see that the optimal solution is $x_1 = x_2 = x_3 = 1/3$, exactly as
expected from the symmetry of the problem, and the corresponding value of the optimal exponent
$\Lambda^*$ is $2a/3$.

IV.  ALTERNATIVE  INTERPRETATIONS. 

Theorem 1 may be restated in a different language referring to a different context. For simplicity,
we only consider the case $M = 2$. Suppose that we want to transmit a binary message and that we
have a collection of noisy, memoryless and independent channels at our disposal. We are allowed
to transmit a total of $N$ times using any of the available channels each time. A receiver observes
the $N$ outputs of the channels, uses its knowledge of which channels were being used, and makes a
decision on what was transmitted. The problem consists of finding which channels should be used
and how many times each, in order to maximize the probability of correct decoding. For small
$N$, it may be better to use a different channel each time, even if the original message is binary.
However, our result states that, for binary messages, as $N \to \infty$, there is a single best channel
which should be used for all transmissions. To see the analogy, think of the hypothesis $H_1$ or $H_2$ as
the value of the binary message which we want to transmit and think of $u_i$ as the output of the $i$-th
transmission. A different channel corresponds to a different decision rule and the characteristics of
the channel correspond to the quantities $p_i^\gamma(d)$.
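
To make the channel analogy concrete, the sketch below ranks a few binary channels by the quantity $\min_{s \in [0,1]} \log \sum_d p_1(d)^{1-s} p_2(d)^s$, which is the exponent obtained when the same channel is used for all $N$ transmissions. The three channels and their output probabilities are hypothetical, and the minimization over $s$ uses a plain grid rather than an exact method.

```python
import math

def mu(p1, p2, s):
    """log sum_d p1(d)^(1-s) * p2(d)^s, summing only where both are positive."""
    return math.log(sum(a ** (1 - s) * b ** s
                        for a, b in zip(p1, p2) if a > 0 and b > 0))

def chernoff_exponent(p1, p2, grid=1000):
    """Approximate min over s in [0, 1] of mu(p1, p2, s) on a uniform grid."""
    return min(mu(p1, p2, k / grid) for k in range(grid + 1))

# Hypothetical channels: output distribution given message 1, given message 2.
channels = {
    "clean":  ([0.90, 0.10], [0.10, 0.90]),
    "noisy":  ([0.70, 0.30], [0.30, 0.70]),
    "skewed": ([0.95, 0.05], [0.40, 0.60]),
}

exponents = {name: chernoff_exponent(p1, p2) for name, (p1, p2) in channels.items()}
best = min(exponents, key=exponents.get)  # most negative exponent wins
print(best, exponents)
```

The channel with the most negative exponent is the one that would be used for every transmission in the large-$N$ regime; for small $N$ this ranking need not identify the best mix of channels.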

A  different  analogy  may  be  made  in  the  context  of  optimal  design  of  measurements  for  failure 
detection.  Suppose  that  we  have  a  system  which  may  be  in  one  of  two  states:  up  or  down.  We 
have  a  collection  of  devices  which  may  be  used  for  failure  detection.  They  are,  however,  unreliable 
and  may  make  errors  of  both  types.  Furthermore,  the  probabilities  of  either  type  of  error  can  be 
different  for  different  devices.  Suppose  that,  in  order  to  increase  reliability  we  want  to  use  N  such 
devices. Then, our result states that, as $N \to \infty$, there exists a single best device and that we
should  use  N  replicas  of  it,  rather  than  using  many  devices  with  different  characteristics. 


In this section we explore Assumption 1. Our objective here is to obtain conditions on the
distributions $P_i$ under which Assumption 1 can be shown to hold. Proposition 1 below deals with
Assumption 1(a).

Proposition 1: Assumption 1(a) fails to hold if and only if there are two hypotheses $H_i$, $H_j$, such
that the corresponding measures $P_i$ and $P_j$ are mutually singular.*

Proof: Suppose that Assumption 1(a) fails. Then, there exist some $i$, $j$ and some $\gamma \in \Gamma$ for which
$p_i^\gamma(d)\, p_j^\gamma(d) = 0$, $\forall d \in \{1,\ldots,D\}$. Thus, for any $d \in \{1,\ldots,D\}$, the set $\{y \in Y : \gamma(y) = d\}$ has
non-zero measure under $P_i$ only if it has zero measure under $P_j$. Since the sets $\{y \in Y : \gamma(y) = d\}$
cover the entire set $Y$, it follows that $P_i$ and $P_j$ are mutually singular. •


As a consequence of Proposition 1, we can see that if there are only two hypotheses and
Assumption 1(a) fails to hold we are dealing with the uninteresting situation where each sensor is able
to determine the true hypothesis on its own, with zero probability of error. For the case of more
than two hypotheses, however, there are nontrivial detection problems in which Assumption 1(a)
fails to hold. We conjecture that a somewhat modified version of Theorem 1, covering such a case,
is possible. We now explore Assumption 1(b) and show that it holds for two interesting situations.

Proposition 2: Suppose that the observation set $Y$ is finite and that Assumption 1(a) holds.
Then Assumption 1(b) also holds.

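Proof: The derivatives of $\mu_{ij}(\gamma,s)$ with respect to $s$ are easily calculated to be [SGB, equations
(3.24)-(3.25)]:

$$\mu'_{ij}(\gamma,s) = \sum_{d=1}^{D} \frac{(p_i^\gamma(d))^{1-s}(p_j^\gamma(d))^{s}}{\sum_{c=1}^{D}(p_i^\gamma(c))^{1-s}(p_j^\gamma(c))^{s}} \log\frac{p_j^\gamma(d)}{p_i^\gamma(d)}, \tag{7}$$

$$\mu''_{ij}(\gamma,s) = \sum_{d=1}^{D} \frac{(p_i^\gamma(d))^{1-s}(p_j^\gamma(d))^{s}}{\sum_{c=1}^{D}(p_i^\gamma(c))^{1-s}(p_j^\gamma(c))^{s}} \log^2\frac{p_j^\gamma(d)}{p_i^\gamma(d)} - \left[\mu'_{ij}(\gamma,s)\right]^2, \tag{8}$$

where all summations are made over those $c$'s and $d$'s for which $p_i^\gamma(c)\,p_j^\gamma(c)$ (respectively, $p_i^\gamma(d)\,p_j^\gamma(d)$)
is nonzero.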


* Two positive measures $P_i$, $P_j$, defined on a common (measurable) space $Y$ are called mutually
singular if there exists a measurable subset $U$ of $Y$ such that $P_i(U) = P_j(Y - U) = 0$.


Let $a$ be the minimum of $p_i^\gamma(c)$, where the minimum is taken over all choices of $\gamma$, $c$, $i$, such that
$p_i^\gamma(c) > 0$. Since $Y$ is finite, the set of all possible decision rules $\gamma$ is also finite and therefore $a$
is the minimum of finitely many positive quantities and is itself positive. By Assumption 1(a) the
denominator in equation (7) must have a nonzero summand and this summand will be bounded
below by $a^{1-s}a^{s} = a$. The numerator is bounded by $D$. Concerning the logarithmic term, it is
bounded, in absolute value, by $|\log a|$, for any $d$ in the range of the summation. We conclude that
$\mu'_{ij}(\gamma,s)$ is bounded in absolute value by a constant independent of $i$, $j$, $\gamma$, $s$. A similar argument
applies to $\mu''_{ij}(\gamma,s)$ and concludes the proof. •
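
The derivative formulas (7) and (8) can be sanity-checked numerically. The sketch below takes $\mu_{ij}(\gamma,s) = \log \sum_d (p_i^\gamma(d))^{1-s}(p_j^\gamma(d))^{s}$, evaluates the two closed-form expressions, and compares them with central finite differences. The two message distributions, the point `s`, and the step `h` are arbitrary illustrative values, not taken from the text.

```python
import math

def mu(p_i, p_j, s):
    """log sum_d p_i(d)^(1-s) p_j(d)^s over d with p_i(d) * p_j(d) > 0."""
    return math.log(sum(a ** (1 - s) * b ** s
                        for a, b in zip(p_i, p_j) if a > 0 and b > 0))

def mu_derivatives(p_i, p_j, s):
    """First and second derivatives of mu in s, following (7) and (8)."""
    pairs = [(a, b) for a, b in zip(p_i, p_j) if a > 0 and b > 0]
    weights = [a ** (1 - s) * b ** s for a, b in pairs]
    logs = [math.log(b / a) for a, b in pairs]
    total = sum(weights)
    first = sum(w * l for w, l in zip(weights, logs)) / total
    second = sum(w * l * l for w, l in zip(weights, logs)) / total - first ** 2
    return first, second

# Hypothetical message distributions under H_i and H_j (D = 3).
p_i = [0.6, 0.3, 0.1]
p_j = [0.2, 0.3, 0.5]
s, h = 0.4, 1e-4

first, second = mu_derivatives(p_i, p_j, s)
fd_first = (mu(p_i, p_j, s + h) - mu(p_i, p_j, s - h)) / (2 * h)
fd_second = (mu(p_i, p_j, s + h) - 2 * mu(p_i, p_j, s) + mu(p_i, p_j, s - h)) / h ** 2
print(first, fd_first, second, fd_second)
```

The second derivative computed from (8) is a variance under the tilted weights, so it is nonnegative; this is consistent with $\mu_{ij}(\gamma,\cdot)$ being convex in $s$.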

Proposition 3: Suppose that, for any $i$, $j$, the measure $P_i$ is absolutely continuous with respect
to $P_j$ and let $L_{ij}$ denote the Radon-Nikodym derivative $dP_i/dP_j$. Assume that

$$E_i[\log^2 L_{ij}] < \infty, \quad \forall i,j. \tag{9}$$

Then Assumption 1 holds.

Proof: The fact that Assumption 1(a) holds is immediate from our assumption of absolute
continuity and Proposition 1.

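For any decision rule $\gamma : Y \to \{1,\ldots,D\}$, let $\mathcal{F}^\gamma$ be the smallest $\sigma$-field contained in $\mathcal{F}$ with
respect to which the function $\gamma$ is measurable. Let $P_i^\gamma$ denote the restriction of the measure $P_i$ on
the $\sigma$-field $\mathcal{F}^\gamma$. It follows from the absolute continuity assumption that $P_i^\gamma$ is absolutely continuous
with respect to $P_j^\gamma$. We define $L_{ij}^\gamma$ to be equal to the Radon-Nikodym derivative $dP_i^\gamma/dP_j^\gamma$. As is
well known

$$L_{ij}^\gamma = E_j[L_{ij} \mid \mathcal{F}^\gamma], \quad \text{a.s. } (P_j). \tag{10}$$

Consider the function $\phi : (0,\infty) \to [0,\infty)$ defined by $\phi(t) = t \log^2 t$. An easy calculation shows
that it is convex. Therefore, using (10) and Jensen's inequality,

$$E_i[\log^2 L_{ij}^\gamma] = E_j[L_{ij}^\gamma \log^2 L_{ij}^\gamma] = E_j[\phi(L_{ij}^\gamma)] = E_j\big[\phi(E_j[L_{ij} \mid \mathcal{F}^\gamma])\big] \le$$

$$E_j\big[E_j[\phi(L_{ij}) \mid \mathcal{F}^\gamma]\big] = E_j[L_{ij} \log^2 L_{ij}] = E_i[\log^2 L_{ij}].$$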

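Using (9), we conclude that there exists a constant $B < \infty$ such that $E_i[\log^2 L_{ij}^\gamma] \le B$, $\forall \gamma, i, j$;
using the inequality $E[|z|] \le 1 + E[z^2]$, we obtain the same conclusion for $E_i[|\log L_{ij}^\gamma|]$.

Notice now that $L_{ij}^\gamma(y) = p_i^\gamma(d)/p_j^\gamma(d)$, for every $y$ such that $\gamma(y) = d$, almost surely. Using this
observation (applied to $L_{ji}^\gamma = 1/L_{ij}^\gamma$), equation (7) may be rewritten as

$$\mu'_{ij}(\gamma,s) = \frac{E_i\big[(L_{ji}^\gamma)^s \log L_{ji}^\gamma\big]}{E_i\big[(L_{ji}^\gamma)^s\big]}; \tag{11}$$

similarly, equation (8) becomes

$$\mu''_{ij}(\gamma,s) = \frac{E_i\big[(L_{ji}^\gamma)^s \log^2 L_{ji}^\gamma\big]}{E_i\big[(L_{ji}^\gamma)^s\big]} - \left[\mu'_{ij}(\gamma,s)\right]^2. \tag{12}$$

Using the obvious inequality $(L_{ji}^\gamma)^s \le 1 + L_{ji}^\gamma$, $\forall s \in [0,1]$, we obtain the bound

$$|\mu'_{ij}(\gamma,s)| \le \frac{E_i\big[|\log L_{ji}^\gamma|\big] + E_i\big[L_{ji}^\gamma\, |\log L_{ji}^\gamma|\big]}{E_i\big[(L_{ji}^\gamma)^s\big]} = \frac{E_i\big[|\log L_{ji}^\gamma|\big] + E_j\big[|\log L_{ji}^\gamma|\big]}{E_i\big[(L_{ji}^\gamma)^s\big]}.$$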

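We have already proved that the numerator is bounded. We now establish a lower bound on
$E_i[(L_{ji}^\gamma)^s]$. Since $E_i[L_{ji}] = 1$, it follows that there exists an $\mathcal{F}$-measurable set $Y_0 \subset Y$ and some
$\epsilon > 0$, $\delta > 0$, such that $P_i(Y_0) \ge \epsilon$ and $L_{ji}(y) \ge \delta$, $\forall y \in Y_0$. Since $x^s \ge \min\{1,x\}$, we obtain
$E_i[L_{ji}^s] \ge \epsilon \min\{1,\delta\}$, $\forall s \in [0,1]$. We now use the fact that the function $\phi(x) = x^s$ is concave, for
any fixed $s \in [0,1]$, and Jensen's inequality to obtain

$$E_i\big[(L_{ji}^\gamma)^s\big] = E_i\big[(E_i[L_{ji} \mid \mathcal{F}^\gamma])^s\big] \ge E_i\big[E_i[L_{ji}^s \mid \mathcal{F}^\gamma]\big] = E_i[L_{ji}^s] \ge \epsilon \min\{1,\delta\}.$$

This concludes the proof that $\mu'_{ij}(\gamma,s)$ is bounded. The proof of the boundedness of $\mu''_{ij}(\gamma,s)$ is
identical and is omitted. •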


VI.  THE  NEYMAN-PEARSON  PROBLEM. 

In this section we consider the Neyman-Pearson version of the problem studied in the preceding
sections. We are given an observation set $Y$, endowed with a $\sigma$-field $\mathcal{F}$. There are two hypotheses
($M = 2$) and for each hypothesis we are given a measure $P_i$ on $(Y,\mathcal{F})$, $i = 1,2$. Let $D$ be a fixed
positive integer and let $\Gamma$ be the set of all measurable functions $\gamma : Y \to \{1,\ldots,D\}$. As before, the
$i$-th sensor makes an independent observation $y_i$ whose statistics are described by $P_j$, assuming
that hypothesis $H_j$ is true. Again, the $i$-th sensor transmits a message $\gamma_i(y_i)$ to a fusion center,
where $\gamma_i \in \Gamma$, and finally the fusion center makes a final decision using a decision rule $\gamma_0$. We allow
$\gamma_0$ to be randomized. That is, the final decision of the fusion center may depend on the messages it
has received as well as an internally generated random variable. Let $\Gamma_0$ be the set of all candidate
decision rules $\gamma_0$ for the fusion center.

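For any given $(\gamma_0,\gamma_1,\ldots,\gamma_N) \in \Gamma_0 \times \Gamma^N$, consider the probabilities of error defined by

$$J_N^1(\gamma_0,\gamma_1,\ldots,\gamma_N) = P_1\big(\gamma_0(\gamma_1(y_1),\ldots,\gamma_N(y_N)) = 2\big), \tag{13}$$

$$J_N^2(\gamma_0,\gamma_1,\ldots,\gamma_N) = P_2\big(\gamma_0(\gamma_1(y_1),\ldots,\gamma_N(y_N)) = 1\big). \tag{14}$$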

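Let us fix a constant $\beta$ belonging to $(0,1)$. We would like to minimize $J_N^1(\gamma_0,\ldots,\gamma_N)$, over all
$\gamma_0,\ldots,\gamma_N$ satisfying

$$J_N^2(\gamma_0,\gamma_1,\ldots,\gamma_N) \le 1 - \beta. \tag{15}$$

The optimal value of $J_N^1$ falls exponentially with $N$ and we define

$$r_N(\gamma_0,\ldots,\gamma_N) = \frac{1}{N} \log J_N^1(\gamma_0,\ldots,\gamma_N).$$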


Let

$$R_N = \inf r_N(\gamma_0,\ldots,\gamma_N), \tag{16}$$

where the infimum is taken over all $(\gamma_0,\ldots,\gamma_N) \in \Gamma_0 \times \Gamma^N$ satisfying (15). We will use the following
assumption:

Assumption 2: a) $P_2$ is absolutely continuous with respect to $P_1$;

b)

$$E_2\left[\log^2\left(\frac{dP_2}{dP_1}\right)\right] = A < \infty, \tag{17}$$

where $dP_2/dP_1$ is the Radon-Nikodym derivative of the two measures.

We define $\mathcal{F}^\gamma$ and $P_i^\gamma$ as in Section V: $\mathcal{F}^\gamma$ is the $\sigma$-field on $Y$ generated by $\gamma$ and $P_i^\gamma$ is the
measure $P_i$ restricted to $\mathcal{F}^\gamma$. The argument in the proof of Proposition 3, in Section V, applies
here and shows that $E_2[\log^2(dP_2^\gamma/dP_1^\gamma)] \le A$, $\forall \gamma \in \Gamma$. The latter inequality also implies that there
exists some $B < \infty$ such that

$$K(\gamma) = E_2\left[\log\frac{dP_2^\gamma}{dP_1^\gamma}\right] \le B, \quad \forall \gamma \in \Gamma. \tag{18}$$

The quantity $K(\gamma)$ defined by equation (18) may be recognized as the Kullback-Leibler [KL]
information distance between the distributions of the random variable $\gamma(y)$ under the two alternative
hypotheses. It is guaranteed to be nonnegative. Furthermore, Stein's Lemma [B] states that $K(\gamma)$
is the asymptotic error exponent if all sensors are using the same decision rule $\gamma$ and if the fusion
center chooses $\gamma_0$ according to the Neyman-Pearson Lemma. In light of this fact, the following
result should be expected.

Theorem 2: If Assumption 2 holds, then

(i) $\lim_{N \to \infty} R_N = -\sup_{\gamma \in \Gamma} K(\gamma)$.

(ii) The value of $R_N$ stays the same if in the definition (16) we impose the additional constraint
$\gamma_1 = \cdots = \gamma_N$.

Proof: (Outline) Fix some $\epsilon > 0$ and let $\gamma^* \in \Gamma$ be such that $K(\gamma^*) \ge \sup_{\gamma \in \Gamma} K(\gamma) - \epsilon$.
Let the fusion center choose $\gamma_0$ optimally, subject to (15). From Stein's Lemma, we obtain
$\lim_{N \to \infty} r_N(\gamma_0,\gamma^*,\ldots,\gamma^*) = -K(\gamma^*)$. In particular, $\limsup_{N \to \infty} R_N \le -K(\gamma^*) \le -\sup_{\gamma \in \Gamma} K(\gamma) + \epsilon$.
Since $\epsilon$ was arbitrary, we conclude that $\limsup_{N \to \infty} R_N \le -\sup_{\gamma \in \Gamma} K(\gamma)$ and we have shown
this bound to be valid under the additional constraint $\gamma_1 = \cdots = \gamma_N$.

In order to complete the proof, it is sufficient to show that for any $\gamma_0,\ldots,\gamma_N$ satisfying (15) we
have

$$r_N(\gamma_0,\ldots,\gamma_N) \ge -\frac{1}{N}\sum_{i=1}^{N} K(\gamma_i) + f(N) \ge -\sup_{\gamma \in \Gamma} K(\gamma) + f(N), \tag{19}$$

where $f$ is a function with the property $\lim_{N \to \infty} f(N) = 0$ and which does not depend on
$\gamma_0,\ldots,\gamma_N$. While this result does not follow from the usual formulation of Stein's Lemma (which
uses the assumption $\gamma_1 = \cdots = \gamma_N$), it may be proved by a small variation of the proof of that
Lemma, and for this reason the proof is omitted. Suffice it to say that we may take the proof of
Stein's Lemma given in [B]. Wherever in that proof convergence in probability of a log-likelihood
ratio to its mean is asserted, we replace such a statement with an inequality which bounds the
probability of a deviation of a log-likelihood ratio from its mean. Such an inequality is obtained
from Chebychev's inequality. Because of (17) the variance of the log-likelihoods of interest admits
the same bound, irrespective of the choice of the $\gamma_i$'s. For this reason, the function $f$ in (19) may
be taken independent of the $\gamma_i$'s. The proof is then completed by taking the infimum of both sides
of (19), over all $\gamma_0,\ldots,\gamma_N$, and then letting $N$ tend to infinity. •

We continue with a few observations. For simplicity we restrict our discussion to the case of
binary messages ($D = 2$).

It is easy to prove that there is no loss of optimality if we constrain the $\gamma_i$'s to correspond to
likelihood ratio tests [HV]. If we are only interested in asymptotics, the same conclusion may be
obtained from Theorem 2: it is not hard to show that if a decision rule does not have the form of
a likelihood ratio test, then another decision rule can be found for which $K(\gamma)$ is even larger. This
leads to the conclusion that, asymptotically, optimality is not lost by assuming that each $\gamma_i$ consists
of a comparison of the likelihood ratio computed by that sensor with a threshold.

As is well-known, randomization is generally required in optimal hypothesis testing, under the
Neyman-Pearson formulation. For this reason, we allowed the decision rule of the fusion center to
employ an internally generated random variable. We may ask whether anything can be gained by
allowing the sensors as well to use randomized decision rules. The answer is generally positive. For
example, if $N = 1$, then the best strategy is to let the single sensor perform an optimal
Neyman-Pearson test (for which randomization is needed) and have the fusion center adopt the decision of
the sensor. Interestingly enough, however, randomization does not help asymptotically as $N \to \infty$,
which we now prove. For any two measures $P$, $Q$ on $(Y,\mathcal{F})$, let $K(Q,P) = E[\log(dQ/dP)]$, where
the expectation is with respect to $Q$. With this notation, $K(\gamma) = K(P_2^\gamma, P_1^\gamma)$, $\forall \gamma \in \Gamma$. It is known,
and easy to show, that $K(Q,P)$ is a convex function of $(Q,P)$. Suppose now that a sensor uses
a decision rule which involves randomization. The pair $(P_1^\gamma, P_2^\gamma)$ of the probability distributions
of the message transmitted by a sensor using a randomized decision rule $\gamma$ lies in the convex hull
of such pairs of probability distributions corresponding to non-randomized decision rules. Using
the convexity of $K$, it follows that randomization cannot help in increasing the supremum of $K(\gamma)$
and, therefore, does not help asymptotically.
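
The convexity argument can be illustrated with a small numerical sketch. Two deterministic binary rules are represented by their message distributions under each hypothesis, a randomized rule by the corresponding mixture of those distributions, and the divergence of the mixture is compared with the mixture of the divergences. The numbers are hypothetical, and taking the expectation under $H_2$ (that is, computing $K(P_2^\gamma, P_1^\gamma)$) is an assumption about the ordering of the arguments.

```python
import math

def kl(q, p):
    """K(Q, P) = sum_d q(d) log(q(d) / p(d)) for finite distributions."""
    return sum(a * math.log(a / b) for a, b in zip(q, p) if a > 0)

def mix(x, y, lam):
    """Distribution of the message when rule x is used w.p. lam, else rule y."""
    return [lam * a + (1 - lam) * b for a, b in zip(x, y)]

# Hypothetical message distributions [P(u = 1), P(u = 2)] under H1 and H2.
rule_a = {"H1": [0.30, 0.70], "H2": [0.85, 0.15]}
rule_b = {"H1": [0.05, 0.95], "H2": [0.40, 0.60]}

k_a = kl(rule_a["H2"], rule_a["H1"])
k_b = kl(rule_b["H2"], rule_b["H1"])

mixed_values, gaps = [], []
for step in range(11):
    lam = step / 10
    q = mix(rule_a["H2"], rule_b["H2"], lam)
    p = mix(rule_a["H1"], rule_b["H1"], lam)
    k_mix = kl(q, p)
    mixed_values.append(k_mix)
    # Convexity of K in the pair (Q, P) makes this gap nonnegative.
    gaps.append(lam * k_a + (1 - lam) * k_b - k_mix)

print(max(mixed_values), max(k_a, k_b))
```

Every randomized mixture in this example has a divergence no larger than the better of the two deterministic rules, as the convexity of $K$ predicts.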

From a computational point of view, the problem of this section is a little easier than the problem
of Section II, the reason being that we do not have the additional free parameter $s$ of Section II.
In particular, with decision rules parametrized by a scalar threshold, maximization of $K(\gamma)$ is
equivalent to a one-dimensional optimization problem. As there may be multiple local optima,
some form of exhaustive search may be required.
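
A minimal sketch of such a one-dimensional exhaustive search is given below for a finite observation set. The distributions `p1` and `p2`, the observation alphabet $\{0,\ldots,4\}$, and the threshold family "$u = 1$ iff $y \ge t$" are all hypothetical; the divergence is computed with the expectation under $H_2$, which is an assumption about the ordering of the two measures in $K(\gamma)$.

```python
import math

# Hypothetical observation distributions on y in {0, 1, 2, 3, 4}.
p1 = [0.40, 0.30, 0.15, 0.10, 0.05]  # under H1
p2 = [0.05, 0.10, 0.15, 0.30, 0.40]  # under H2

def kl(q, p):
    """K(Q, P) = sum_d q(d) log(q(d) / p(d)) for finite distributions."""
    return sum(a * math.log(a / b) for a, b in zip(q, p) if a > 0)

def message_dist(p, t):
    """Distribution of the binary message u, where u = 1 iff y >= t."""
    upper = sum(p[t:])
    return [upper, 1.0 - upper]

def k_of_threshold(t):
    """Divergence between the message distributions induced by threshold t."""
    return kl(message_dist(p2, t), message_dist(p1, t))

# Exhaustive search over all nontrivial thresholds.
scores = {t: k_of_threshold(t) for t in range(1, len(p1))}
best_t = max(scores, key=scores.get)
print(best_t, scores)
```

With a continuous observation the same loop would run over a discretized set of thresholds; because the objective may have several local optima, a grid of this kind is safer than a local ascent method.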

As an illustration, we study the performance of a naive selection of the decision rule $\gamma$ of each
sensor. We let each sensor perform a maximum likelihood test and transmit its decision to the
fusion center. This is certainly a bad idea if $N = 1$ because in that case the sensor should
perform a Neyman-Pearson test which is, generally, different from a maximum likelihood test. Still,
one may wonder whether such a naive prescription has any performance guarantees, as $N \to \infty$.
The answer is negative, as the following example shows. Let $P_1$ and $P_2$ be as in Figure 1. A decision
rule $\gamma$ corresponding to a maximum likelihood test is to let $\gamma(y) = 1$ if and only if $y > 1/2$. For
this choice of $\gamma$, if we assume that $\epsilon$ is small enough and use a Taylor series expansion we obtain


where $A$ is some positive constant. Let us now consider the decision rule $\gamma$ given by $\gamma(y) = 1$ if
and only if $y > 1$. We then have $K(\gamma) = \log(1/(1 - \epsilon/2)) \ge \epsilon/2 + B\epsilon^2$, for some constant $B$. We
conclude from this example that the naive decision rule suggested above can be far from optimal
(in terms of error exponent) by an arbitrary multiplicative factor.

APPENDIX

We consider here the problem introduced in Section II, with two hypotheses ($M = 2$), binary
messages ($D = 2$), two sensors ($N = 2$), and with $y_1$, $y_2$ identically distributed and conditionally
independent  given  either  hypothesis.  We  present  an  example  which  shows  that  it  is  possible  that 
different  sensors  may  have  to  use  different  decision  rules  even  if  their  observations  are  identically 
distributed.  An  example  of  this  type  was  presented  in  [TeSa].  However,  that  example  used  a  special 
cost  function  which  introduced  a  large  penalty  if  both  sensors  send  the  same  message  and  the  wrong 
decision  is  made  by  the  fusion  center.  Naturally,  this  creates  an  incentive  for  the  sensors  to  try 
to  transmit  different  messages,  and  therefore  use  different  decision  rules.  Thus,  the  asymmetry  of 
the  optimal  decision  rules  of  the  two  sensors  can  be  ascribed  to  this  particular  aspect  of  the  cost 
function  and  does  not  prove  that  asymmetrical  decision  rules  may  be  optimal  for  our  cost  function 
(probability  of  error). 


Our example is the following. We let $H_1$ and $H_2$ be equally likely. The observations $y_1$, $y_2$ are
conditionally independent, given either hypothesis, take values in $\{1,2,3\}$ and have the following
common distribution:

$P(y = 1 \mid H_1) = 4/5$, $P(y = 2 \mid H_1) = 1/5$, $P(y = 3 \mid H_1) = 0$,

$P(y = 1 \mid H_2) = 1/3$, $P(y = 2 \mid H_2) = 1/3$, $P(y = 3 \mid H_2) = 1/3$.

An optimal set of decision rules may be found by exhaustive enumeration. Since each sensor has
to perform a likelihood ratio test, there are only two candidate decision rules for each sensor:

(A) $u_i = 1$ iff $y_i = 1$,

(B) $u_i = 1$ iff $y_i \in \{1,2\}$.

Thus, we need to consider three possibilities: (i) both sensors use (A); (ii) both sensors use (B);
(iii) sensor 1 uses (A) and sensor 2 uses (B). Naturally, we assume that the fusion center is using the
maximum a posteriori probability rule.

Explicit evaluation of the expected cost for each possibility shows that the optimal set of decision
rules consists of one sensor using decision rule A, one sensor using decision rule B and the fusion
center deciding $H_1$ if and only if $u_1 = u_2 = 1$, for an expected cost of 19/90.
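
The enumeration described above is small enough to reproduce exactly. The following Python sketch computes, for each of the three rule assignments, the error probability of the maximum a posteriori fusion rule, using rational arithmetic. The encoding of the second message value as `2` and the helper names are illustrative choices.

```python
from fractions import Fraction as F
from itertools import product

prior = {1: F(1, 2), 2: F(1, 2)}
# Common observation distribution P(y | H_h) for y in {1, 2, 3}.
p_y = {
    1: {1: F(4, 5), 2: F(1, 5), 3: F(0)},
    2: {1: F(1, 3), 2: F(1, 3), 3: F(1, 3)},
}
# Candidate sensor rules: message 1 or message 2.
rules = {
    "A": lambda y: 1 if y == 1 else 2,
    "B": lambda y: 1 if y in (1, 2) else 2,
}

def message_dist(rule, h):
    """Distribution of the message u under hypothesis h."""
    dist = {1: F(0), 2: F(0)}
    for y, p in p_y[h].items():
        dist[rule(y)] += p
    return dist

def error_probability(name1, name2):
    """Error probability of the MAP fusion rule for the given sensor rules.

    For each message pair the MAP rule picks the hypothesis with the larger
    joint probability, so the error contribution is the smaller one.
    """
    total = F(0)
    for u1, u2 in product((1, 2), repeat=2):
        joint = [prior[h]
                 * message_dist(rules[name1], h)[u1]
                 * message_dist(rules[name2], h)[u2]
                 for h in (1, 2)]
        total += min(joint)
    return total

costs = {pair: error_probability(*pair)
         for pair in [("A", "A"), ("B", "B"), ("A", "B")]}
print(costs)
```

The three costs come out as 53/225 for (A, A), 2/9 for (B, B) and 19/90 for (A, B), so the asymmetric assignment is strictly best, matching the value quoted in the text.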


